Abstract. The paper is concerned with a dissipativity theory and robust performance analysis of discrete-time stochastic 
systems driven by a statistically uncertain random noise. The uncertainty is quantified by the conditional relative entropy of 
the actual probability law of the noise with respect to a nominal product measure corresponding to a white noise sequence. 
We discuss a balance equation, dissipation inequality and superadditivity property for the corresponding conditional relative 
entropy supply as a function of time. The problem of minimizing the supply required to drive the system between given 
state distributions over a specified time horizon is considered. Such variational problems, involving entropy and probabilistic 
boundary conditions, are known in the literature as Schrödinger bridge problems. In application to control systems, this 
minimum required conditional relative entropy supply characterizes the robustness of the system with respect to an uncertain 
noise. We obtain a dynamic programming Bellman equation for the minimum required conditional relative entropy supply and 
establish a Markov property of the worst-case noise with respect to the state of the system. For multivariable linear systems 
with a Gaussian white noise sequence as the nominal noise model and Gaussian initial and terminal state distributions, the 
minimum required supply is obtained using an algebraic Riccati equation which admits a closed-form solution. We propose 
a computable robustness index for such systems in the framework of an entropy theoretic formulation of uncertainty and 
provide an example to illustrate this approach. 
1. Introduction. Design of feedback control for stochastic systems, which is usually aimed at 
suppressing the effect of random disturbances on the performance of the system, often confronts the 
situation where the statistical characteristics of the noise are not known precisely. Such statistical 
uncertainty can arise both from inaccuracies in prior probabilistic information on the noise and from 
variability of the random environment in which the control system operates. An approach which 
is often practiced in optimal control design in this case (see, for example, [18, 19]) is to employ 
a relatively simple model for the noise (sometimes upon augmenting the state of the system to 
incorporate a noise shaping filter) and to optimize the feedback in the closed-loop system for the 
case of the nominal noise. 

A different paradigm is employed by robust control approaches, such as in [31], which are 
aimed at achieving "uniformly" guaranteed performance of the system over a class of uncertainties 
(especially, in worst-case scenarios). This is at the expense of losing the optimality in the nominal 
noise case (which often plays the role of a "center" of the uncertainty class). However, the robust 
controller itself, and the performance of the resulting closed-loop system, depends on the particular 
description of uncertainty which was used to design them. That is, the robustness of the closed-loop 
system, which is secured against a particular class of uncertainties, may be less satisfactory with 
respect to another class of uncertainties. 

Therefore, the problem of robust performance analysis for a given closed-loop system with respect 
to different classes of uncertainties is important regardless of whether the system has been 
obtained from a robust or optimal control design methodology. More precisely, the problems of interest 
here are concerned with the performance deterioration of a system subject to uncertain random 
noise in comparison to the performance of the system when subject to the nominal noise. The statistically 
uncertain noise can be viewed as resulting from the actions of a hypothetical noise player 
who has access to the current state of the system and employs this information in generating the 
future noise inputs in order to drive the system away from its nominal behavior. In this regard, 
a key approach, which constitutes an important part of recent robust stochastic control and 
filtering theory, is provided by formulations of statistical uncertainty using entropy theoretic constructs 
[10, 23, 30, 31, 32, 34, 38, 39, 40, 41, 47] (see also [7, 11, 12] for their connections with 
risk-sensitive control). Although entropy and related concepts have a long history in equilibrium 
statistical mechanics [22], their applications to robust control are more reminiscent of nonequilibrium 
statistical physics formulations and also have a bearing on information theory [8, 13]. The deviation 
of the actual noise probability law from the nominal noise model, which results in a corresponding 
deviation of the system from the equilibrium probability distribution under the nominal noise, can 
be interpreted in terms of the supply-storage relations of dissipativity theory [45, 46]. 

The aim of the present paper is to combine the dissipativity theory viewpoint with an entropy 
theoretic formulation of statistical uncertainty in order to develop a tractable robustness index for 
discrete-time stochastic systems driven by an uncertain random noise. The uncertainty is quantified 
by the conditional relative entropy [13] of the actual noise probability law, given the initial state 
of the system, with respect to a nominal product measure corresponding to a white noise sequence, 
independent of the initial state. This quantity measures not only the deviation of the noise from its 
nominal model but also the extent to which the noise player uses knowledge of the current state of the 
system for future noise generation. The conditional relative entropy can therefore be interpreted as 
a resource which the noise player spends economically in performing the role of driving the system 
away from its nominal behavior. This nominal behavior is characterized in terms of the existence of 
a nominal invariant state distribution which the system would have in the nominal white noise case. 

Although the conditional relative entropy supply is apparently less symmetric in time than the 
unconditional relative entropy, it satisfies a balance equation which involves time reversal through 
a Bayesian term [4, 15]. A related dissipation inequality describes the influence of the conditional 
relative entropy supply for the noise player on the deviation of the system from the nominal invariant 
state distribution. The deviation of the system from the nominal invariant state distribution is also 
measured in relative entropy terms and plays the role of a storage function. As a function of time, the 
conditional relative entropy supply is superadditive [33] in contrast to its deterministic counterpart in 
[45, 46] (which is additive as the integral of a supply rate over the time interval). However, additivity 
is recovered for a class of noise sequences which are Markov with respect to the state of the system 
and play an important role as economical noise strategies. 

A problem of minimizing the conditional relative entropy supply required for the noise player to 
drive the system between given initial and terminal state distributions over a specified time horizon 
is then considered. Variational problems, which are concerned with entropy minimization under 
such probabilistic boundary conditions, are known as Schrödinger bridge problems [2, 29]. These 
problems are usually treated in the context of reciprocal processes, that is, Markov random fields 
on the time axis [17]; see also [1, 5, 9, 25, 43] for continuous time formulations. Such problems 
have also been studied for quantum systems [3] using the formalism of stochastic mechanics [27], 
and conventional quantum mechanical settings [29]. In application to robust performance analysis, 
the minimum required conditional relative entropy supply characterizes the robustness of the system 
with respect to the uncertain noise. Indeed, the larger the required supply, the more "sluggish" 
the system is with respect to the actions of the noise player. We obtain a dynamic programming 
Bellman equation for the minimum required conditional relative entropy supply and establish the 
Markov property of the corresponding worst-case noise with respect to the state of the system. 

A related state distribution tracking problem leads to the minimum conditional relative entropy 
supply rate (per time step), which is required for the noise player to maintain the system in a given 
state distribution. In combination with a loss functional (which measures the system performance 
deterioration associated with the deviation from the nominal invariant state distribution), the minimum 
supply rate, required to achieve a specified level of the system performance loss, provides a 
useful robustness index. 

The specialization of the above results to the case of multivariable linear systems with a white 
Gaussian nominal noise sequence and Gaussian initial and terminal state distributions, allows the 
minimum required supply to be determined using an algebraic Riccati equation which admits a 
closed-form solution. For a class of one-step reachable linear systems, in the framework of the entropy 
theoretic description of uncertainty, we propose a particular robustness index associated with 
the increase in a weighted second moment of the state variables as the loss functional. Related, 
though different, ideas, which combine the second moment increase with entropy theoretic formulations 
of uncertainty, can be found in [7, 10, 23, 32, 34, 39, 40, 42]. The computation of the robustness 
index is reduced to solving two coupled algebraic equations in a matrix and a scalar parameter, 
which can be carried out numerically by using, for example, homotopy methods. As an illustration, 
we provide an explicit calculation of the robustness index for one-dimensional linear systems. 

The paper is organized as follows. Section 2 specifies the class of uncertain stochastic systems 
being considered. Section 3 describes the nominal white noise model and the associated nominal 
invariant state distribution of the system. Section 4 specifies a measure of statistical uncertainty 
as the conditional relative entropy of the actual noise with respect to the nominal noise. Section 5 
discusses a dissipation inequality and a superadditivity property for the conditional relative entropy 
supply and introduces Markov noise strategies. Section 6 describes a procedure which leads to 
a Markov noise strategy with a decreased conditional relative entropy while preserving the state 
distributions of the system. Section 7 employs this procedure to establish a dynamic programming 
Bellman equation for the minimum conditional relative entropy supply, required to drive the system 
between given initial and terminal state distributions, and introduces a system robustness index. 
Sections 8 to 12 are concerned with the case of linear dynamics and a white Gaussian nominal noise 
sequence. Section 8 establishes conditions for the reachability of Gaussian state distributions of the 
linear system. Section 9 reduces the problem of computing the minimum required supply for the 
case of Gaussian boundary conditions to an algebraic Riccati equation. A closed-form solution of 
this equation is given in Section 10 and is used in Section 11 for computing the robustness index for 
a class of linear systems. Section 12 provides an example which explicitly calculates the robustness 
index for a one-dimensional linear system. 

2. Stochastic systems with statistically uncertain noise. We consider a discrete-time system 
with a state signal $X := (X_k)_{k \geq 0}$, driven by a noise input $W := (W_k)_{k \geq 0}$. In order to capture various 
special cases in a general formulation, the values $X_k$ and $W_k$ of these signals (at the $k$th time step) 
are assumed to belong to Polish (complete separable metric) spaces $\mathcal{X}$ and $\mathcal{W}$, endowed with Borel 
$\sigma$-algebras $\mathfrak{X}$ and $\mathfrak{W}$, respectively. The dynamics of the system in the state space $\mathcal{X}$ are governed 
by a time-invariant equation 

$$X_{k+1} = f(X_k, W_k), \qquad k = 0, 1, 2, \ldots, \tag{2.1}$$

where $f : \mathcal{X} \times \mathcal{W} \to \mathcal{X}$ is a given Borel measurable one-step state transition map. Thus, the states 
of the system at any two moments of time are related by 

$$X_t = F_{t-s}(Y_{s:t-1}) = F_{t-s}(X_s, W_{s:t-1}) = F_{t-s}(X_s, W_s, \ldots, W_{t-1}), \qquad 0 \leq s \leq t. \tag{2.2}$$

Here, $F_k : \mathcal{X} \times \mathcal{W}^k \to \mathcal{X}$ denotes the $k$ step state transition map, which satisfies the recurrence 
relation 

$$F_{k+1}(x_0, w_0, \ldots, w_k) = f(F_k(x_0, w_0, \ldots, w_{k-1}), w_k) \tag{2.3}$$

for all $x_0 \in \mathcal{X}$ and $w_0, \ldots, w_k \in \mathcal{W}$, with the initial condition that $F_0$ is the identity map on the state 
space $\mathcal{X}$. Also, for the time interval $[s,t]$, 

$$Y_{s:t} := (X_s, W_{s:t}) = (X_s, W_s, \ldots, W_t) \tag{2.4}$$

denotes the state-noise sequence which is formed from the initial state $X_s$ of the system and the noise 
sequence 

$$W_{s:t} := (W_s, \ldots, W_t). \tag{2.5}$$
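
The recurrence (2.3) and the relation (2.2) can be illustrated by a minimal Python sketch. The names `k_step_map`, `trajectory` and `example_f` are hypothetical, and the scalar map $f(x,w) = 0.5x + w$ is an assumption chosen only for illustration; it does not come from the paper.

```python
from typing import Callable, Iterable, List, TypeVar

State = TypeVar("State")
Noise = TypeVar("Noise")


def k_step_map(f: Callable[[State, Noise], State], x0: State,
               noise: Iterable[Noise]) -> State:
    """Evaluate F_k(x0, w_0, ..., w_{k-1}) through the recurrence (2.3)."""
    x = x0  # F_0 is the identity map on the state space
    for w in noise:
        x = f(x, w)  # F_{j+1}(x0, w_0, ..., w_j) = f(F_j(x0, w_0, ..., w_{j-1}), w_j)
    return x


def trajectory(f: Callable[[State, Noise], State], x0: State,
               noise: Iterable[Noise]) -> List[State]:
    """Return [X_0, X_1, ..., X_k] generated by the dynamics (2.1)."""
    states = [x0]
    for w in noise:
        states.append(f(states[-1], w))
    return states


# Hypothetical one-step map (an illustrative assumption): f(x, w) = 0.5 x + w.
def example_f(x: float, w: float) -> float:
    return 0.5 * x + w


noise = [1.0, 0.0, 2.0]
xs = trajectory(example_f, 1.0, noise)

# Relation (2.2) with s = 1 and t = 3: X_3 = F_2(X_1, W_1, W_2).
assert k_step_map(example_f, xs[1], noise[1:]) == xs[3]
```

The same loop serves for any pair of Polish spaces represented by Python objects, since only the one-step map is evaluated.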

Randomness is introduced into the system (2.1) by assuming that the initial state $X_0$ and the noise 
sequence $W$ are random elements. Their joint probability distribution $P_{X_0,W}$ is a probability measure$^1$ 
on the measurable space $(\mathcal{X} \times \mathcal{W}^\infty, \mathfrak{X} \times \mathfrak{W}^\infty)$. Accordingly, the state sequence $X$ is an $\mathcal{X}^\infty$-valued 
random element with a probability distribution $P_X$ on $(\mathcal{X}^\infty, \mathfrak{X}^\infty)$. Since $X$ depends on $X_0$ and $W$ in 
a deterministic way, as described by (2.2) and (2.3), with the map $(X_0, W) \mapsto X$ being completely 
specified by the one-step state transition map $f$, the probability distribution $P_X$ can be expressed in 
terms of $P_{X_0,W}$. In particular, consider the state distribution 

$$P_k := P_{X_k} \tag{2.6}$$

of the system at time $k$, that is, an appropriate marginal probability distribution of $X_k$ on the measurable 
space $(\mathcal{X}, \mathfrak{X})$ which corresponds to $P_X$. In view of (2.2), the state distributions (2.6) are related 
to the probability distributions 

$$Q_{s,t} := P_{Y_{s:t-1}} = P_{X_s, W_{s:t-1}} = P_{X_s, W_s, \ldots, W_{t-1}} \tag{2.7}$$

of the state-noise sequences (2.4), that is, the joint probability distributions of $X_s$ and $W_{s:t-1}$ on 
$(\mathcal{X} \times \mathcal{W}^{t-s}, \mathfrak{X} \times \mathfrak{W}^{t-s})$: 

$$P_t = Q_{s,t} \circ F_{t-s}^{-1}. \tag{2.8}$$

The right-hand side of (2.8) is the image measure [33, pp. 51-52] of the probability distribution 
$Q_{s,t}$ under the $t-s$ step state transition map $F_{t-s}$, with $F_k^{-1}(S) := \{y \in \mathcal{X} \times \mathcal{W}^k : F_k(y) \in S\}$ the 
pre-image of a set $S \in \mathfrak{X}$. In turn, $Q_{s,t}$ in (2.7) is completely specified by the initial state distribution 
$P_s$ and the conditional probability distribution $P_{W_{s:t-1}|X_s}$ of the noise sequence $W_{s:t-1}$ given $X_s$ in view 
of the chain rule for probability measures 

$$Q_{s,t}(dx \times dw) = P_s(dx) P_{W_{s:t-1}|X_s}(dw \mid x), \qquad x \in \mathcal{X}, \ w \in \mathcal{W}^{t-s}. \tag{2.9}$$

The equations (2.1) may describe the dynamics of a closed-loop system obtained by applying a given 
feedback controller to a given plant, in which case X incorporates both the plant and controller state 
variables. Then, the plant is subject to an external random noise W. The design of such a controller 
often employs a relatively simple statistical model for the noise and is aimed at suppressing the 
influence of the noise on the closed-loop system performance. Although the nominal noise model 
is not guaranteed to be accurate, the feedback is usually developed so as to make the system "well-behaved" 
at least under the nominal noise (for example, by an appropriate choice of the map $f$ in 
(2.1)). Whereas the meaning of this depends on a specific control context, the property of being 
well-behaved (which is pursued by the control designer) is understood here as the existence of an 
invariant probability measure for the system state sequence X under the nominal noise. 

3. Nominal noise model and nominal invariant state distribution. A typical nominal noise 
model is that W is a "white noise" sequence of independent identically distributed random elements 
which are also independent of $X_0$. 

$^1$ We denote by $P_\xi$ the probability distribution of a random element $\xi$ and by $P_{\xi|\eta}$ the conditional probability distribution 
of $\xi$ with respect to another random element $\eta$, with $\xi$ and $\eta$ taking values in Polish spaces. Thus, $P_{\xi|\eta}(S \mid y)$ is a probability 
measure of a Borel set $S$ and a Borel measurable function of $y$. The joint probability distribution of $\xi$ and $\eta$ is denoted by $P_{\xi,\eta}$. 

DEFINITION 3.1. Suppose $R$ is a given probability measure on the measurable space $(\mathcal{W}, \mathfrak{W})$. 
The noise $W$ is called nominal if $W_0, W_1, \ldots$ are independent $R$-distributed random elements, 
independent of the initial state of the system $X_0$, so that the corresponding conditional probability 
distribution is a product measure 

$$P^*_{W|X_0} = R^\infty = R \times R \times \cdots. \tag{3.1}$$

The probability measure $P^*_{W|X_0}$ in (3.1), under which the noise $W$ has a simple statistical structure 
(specified completely by the nominal marginal distribution $R$ of $W_k$), plays the role of a model 
for the unknown actual noise probability measure $P_{W|X_0}$. Under the nominal noise defined in 
Definition 3.1, the state sequence $X$ is a homogeneous Markov chain with transition probability measure 

$$G(S \mid x) := R(f(x,\cdot)^{-1}(S)), \qquad S \in \mathfrak{X}, \ x \in \mathcal{X}, \tag{3.2}$$

where $f(x,\cdot)^{-1}(S) := \{w \in \mathcal{W} : f(x,w) \in S\}$. In this case, the state distributions $P_k$ from (2.6) satisfy 
the recurrence equation 

$$P_{k+1}(S) = \int_{\mathcal{X}} G(S \mid x) P_k(dx) = (P_k \times R)(f^{-1}(S)), \tag{3.3}$$

where $f^{-1}(S) := \{(x,w) \in \mathcal{X} \times \mathcal{W} : f(x,w) \in S\}$ is the pre-image of a set $S \in \mathfrak{X}$ under the one-step 
state transition map $f$. An invariant measure for the Markov chain $X$ is a probability measure $P_*$ 
on $(\mathcal{X}, \mathfrak{X})$ which is a fixed point of the linear integral operator described by the right-hand side of 
(3.3). That is, 

$$P_* = (P_* \times R) \circ f^{-1}. \tag{3.4}$$

By induction, (2.8) allows (3.4) to be extended to the image measure under the $k$ step state transition 
map $F_k$ in (2.2) and (2.3) as 

$$P_* = (P_* \times R^k) \circ F_k^{-1}, \qquad k > 0. \tag{3.5}$$

DEFINITION 3.2. An invariant measure $P_*$ of the Markov chain $X$ under the nominal noise from 
Definition 3.1 in the sense of (3.4) is referred to as a nominal invariant state distribution for the 
system. 
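
For finite state and noise spaces, the transition kernel (3.2), the recurrence (3.3) and the fixed-point property (3.4) reduce to elementary sums. The following Python sketch illustrates this under stated assumptions: the reflecting random walk on $\{0,1,2\}$, the noise distribution `R` and the function names are all hypothetical, and the fixed-point iteration is only expected to converge when the chain is ergodic, as it is in this example.

```python
def transition_kernel(f, states, noises, R):
    """Kernel (3.2) on finite spaces: G[x][y] = R({w : f(x, w) = y})."""
    G = {x: {y: 0.0 for y in states} for x in states}
    for x in states:
        for w in noises:
            G[x][f(x, w)] += R[w]
    return G


def propagate(P, G):
    """One step of the recurrence (3.3): P_{k+1}(y) = sum_x G(y | x) P_k(x)."""
    states = list(P)
    return {y: sum(G[x][y] * P[x] for x in states) for y in states}


def invariant_by_iteration(G, P_init, steps=500):
    """Approximate a fixed point of (3.3) by repeated propagation."""
    P = dict(P_init)
    for _ in range(steps):
        P = propagate(P, G)
    return P


# Hypothetical example: a random walk on {0, 1, 2} reflected at the ends.
states = (0, 1, 2)
noises = (-1, 1)
R = {-1: 0.6, 1: 0.4}  # assumed nominal marginal distribution of the noise


def f(x, w):
    return min(max(x + w, 0), 2)


G = transition_kernel(f, states, noises, R)
P_star = invariant_by_iteration(G, {0: 1.0, 1: 0.0, 2: 0.0})
```

In this example the detailed balance relations give the invariant distribution $(9/19, 6/19, 4/19)$, which the iteration reproduces to within rounding error.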

In what follows, we assume that a nominal invariant state distribution $P_*$ for the system exists, 
though is not necessarily unique. Any such $P_*$ is an equilibrium point for the state distributions 
$P_0, P_1, \ldots$ of the system, governed by (3.3) under the nominal noise. General criteria for the existence 
of invariant measures for Markov chains are beyond the scope of the present paper. However, we will 
describe a version of Harris's theorem from [14, 24], which guarantees the existence and uniqueness 
of an invariant probability measure. In application to our specific context, the sufficient conditions 
are as follows. Suppose there exist a Borel measurable function $V : \mathcal{X} \to \mathbb{R}_+$ and constants $0 \leq q < 1$ 
and $r$ such that the inequality 

$$\mathbf{E} V(f(x, \omega)) := \int_{\mathcal{W}} V(f(x, w)) R(dw) = \int_{\mathcal{X}} V(y) G(dy \mid x) \leq q V(x) + r \tag{3.6}$$

holds for all $x \in \mathcal{X}$. Here, the expectation $\mathbf{E}(\cdot)$ is taken over an $R$-distributed random element $\omega$ 
with values in $\mathcal{W}$, and $G$ is the Markov transition kernel (3.2) of the state sequence $X$ under the 
nominal noise. Also, suppose 

$$\sup_{x, y \in \mathcal{X}_v} \ \sup_{g} \left| \mathbf{E}\big(g(f(x, \omega)) - g(f(y, \omega))\big) \right| < 2 \tag{3.7}$$

for any $v > 0$, where $\mathcal{X}_v := \{x \in \mathcal{X} : V(x) \leq v\}$ denotes the corresponding sub-level set of the function 
$V$ from (3.6), and the second supremum is taken over Borel measurable real-valued functions $g$ 
on $\mathcal{X}$ whose absolute value does not exceed one. The left-hand side of (3.7) is the diameter of the 
set $\{G(\cdot \mid x) : x \in \mathcal{X}_v\}$ in the sense of the total variation distance between probability measures [36]. 
Then, in view of [14, Theorem 3.6 on p. 13], the conditions (3.6) and (3.7) (which correspond to 
[14, Assumptions 3.1 and 3.4 on p. 12]) imply that the system (2.1) has a unique nominal invariant 
state distribution $P_*$. 
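
As a hedged illustration of the conditions (3.6) and (3.7), consider a hypothetical scalar linear system $f(x,w) = ax + w$ with $|a| < 1$, Gaussian nominal noise $R = N(0, \sigma^2)$ and $V(x) = x^2$; this example and the function names below are assumptions for illustration only. In this case $\mathbf{E}V(f(x,\omega)) = a^2x^2 + \sigma^2$, so that (3.6) holds with $q = a^2$ and $r = \sigma^2$, and the left-hand side of (3.7) has the standard closed form $2\,\mathrm{erf}(|a|\sqrt{v}/(\sigma\sqrt{2}))$ for equal-variance Gaussian kernels. The sketch checks the drift condition by numerical quadrature.

```python
import math


def expected_V_next(a, sigma, x, n=20001, span=10.0):
    """Trapezoidal approximation of E V(f(x, w)) for V(x) = x^2,
    f(x, w) = a x + w and w ~ N(0, sigma^2), truncated to |w| <= span * sigma."""
    lo = -span * sigma
    h = 2.0 * span * sigma / (n - 1)
    total = 0.0
    for i in range(n):
        w = lo + i * h
        weight = 0.5 if i in (0, n - 1) else 1.0
        density = math.exp(-0.5 * (w / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))
        total += weight * (a * x + w) ** 2 * density * h
    return total


def tv_diameter(a, sigma, v):
    """Left-hand side of (3.7) on the sub-level set {x : x^2 <= v} for the
    Gaussian kernel G(. | x) = N(a x, sigma^2); the two extreme means are
    separated by 2 |a| sqrt(v)."""
    return 2.0 * math.erf(abs(a) * math.sqrt(v) / (sigma * math.sqrt(2.0)))


a, sigma = 0.8, 1.0          # assumed system and noise parameters
q, r = a ** 2, sigma ** 2    # constants in the drift condition (3.6)

for x in (-3.0, 0.0, 2.5):
    # Small tolerance accounts for quadrature and rounding errors.
    assert expected_V_next(a, sigma, x) <= q * x ** 2 + r + 1e-9

assert tv_diameter(a, sigma, 4.0) < 2.0
```

Since the error function is strictly less than one for finite arguments, the diameter stays below 2 for every finite $v$ in exact arithmetic, although it rounds to 2 in floating point for very large $v$.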

The actual conditional distribution $P_{W|X_0}$ of the noise may differ from its nominal model (3.1). 
In particular, there can be statistical dependence between $W_k$'s at different times or between the 
noise $W$ and the initial state of the system $X_0$. Also, the marginal distribution of $W_k$ may differ 
from $R$ even if $W$ is indeed a white noise sequence. The discrepancy between the true $P_{W|X_0}$ and its 
nominal model, present in all these cases, is interpreted as statistical uncertainty in the noise $W$. 

The dependence of the conditional distribution of the future noise on the current state of the system 
(which depends on the past history of the noise) can arise in the case of a "colored" noise whose 
values at different moments of time are statistically dependent. Without specifying a mechanism for 
the memory effects in the random environment which produce such a noise$^2$, we will interpret the 
conditional probability distributions $P_{W_{s:t-1}|X_s}$ in (2.9) as the strategy of a hypothetical noise player 
who opposes the control designer. More precisely, it is assumed that the noise player has access to 
the current state $X_s$ of the system at any moment of time $s$ and uses this information in generating 
the future noise inputs $W_s, W_{s+1}, \ldots$ so as to make the system deviate from the nominal behavior 
described in Section 3. In particular, this process can be viewed as the noise player aiming to drive 
the actual state distribution $P_t$ of the system away from the nominal invariant state distribution $P_*$. 
That is, the noise player aims to drive the state distribution away from the probabilistic equilibrium 
of the system under the nominal noise. The extent to which the actual probability distribution $P_X$ of 
the state sequence differs from the probability law of a Markov chain with the transition kernel (3.2) 
and invariant measure $P_*$ depends on the amount of statistical uncertainty in the noise. 

4. Conditional relative entropy to quantify statistical uncertainty. Similarly to stochastic 
robust control settings such as in [30, 38, 47], the deviation of the conditional noise distribution 
$P_{W|X_0}$ from its nominal model (3.1) will be quantified in terms of the conditional relative entropy 
[13, Section 5.3]. 

Recall that for two conditional probability distributions $P_{\xi|\eta}$ and $P^*_{\xi|\eta}$ of random elements $\xi$ 
and $\eta$ with values in Polish spaces $\mathcal{S}_1$ and $\mathcal{S}_2$, the conditional relative entropy of $P_{\xi|\eta}$ with respect 
to $P^*_{\xi|\eta}$ is defined as 

$$\begin{aligned}
D(P_{\xi|\eta} \| P^*_{\xi|\eta}) &:= \mathbf{E} \ln \varphi(\xi \mid \eta) = \int_{\mathcal{S}_1 \times \mathcal{S}_2} \ln \varphi(x \mid y) P_{\xi,\eta}(dx \times dy) \\
&= \int_{\mathcal{S}_2} \Big( \int_{\mathcal{S}_1} L(\varphi(x \mid y)) P^*_{\xi|\eta}(dx \mid y) \Big) P_\eta(dy) \\
&= \int_{\mathcal{S}_2} D_0\big(P_{\xi|\eta}(\cdot \mid y) \,\|\, P^*_{\xi|\eta}(\cdot \mid y)\big) P_\eta(dy), \qquad \varphi(x \mid y) := \frac{P_{\xi|\eta}(dx \mid y)}{P^*_{\xi|\eta}(dx \mid y)},
\end{aligned} \tag{4.1}$$

where the expectation is taken over the joint probability distribution $P_{\xi,\eta}$ of $\xi$ and $\eta$, associated 
with $P_{\xi|\eta}$ by the chain rule $P_{\xi,\eta}(dx \times dy) = P_{\xi|\eta}(dx \mid y) P_\eta(dy)$, and the functional $D_0$ is described 
below. Here, the function 

$$L(p) := p \ln p \tag{4.2}$$

is defined on $\mathbb{R}_+$, with the standard convention that $L(0) = 0$. Also, $\varphi : \mathcal{S}_1 \times \mathcal{S}_2 \to \mathbb{R}_+$ in (4.1) is a 
Borel measurable function, which, for any fixed but otherwise arbitrary value of its second argument 
$y \in \mathcal{S}_2$, describes the Radon-Nikodym derivative [28, 33, 35] of the probability measure $P_{\xi|\eta}(\cdot \mid y)$ 
with respect to the reference probability measure $P^*_{\xi|\eta}(\cdot \mid y)$, so that $P_{\xi|\eta}(S \mid y) = \int_S \varphi(x \mid y) P^*_{\xi|\eta}(dx \mid y)$ 
for any Borel subset $S \subset \mathcal{S}_1$. This conditional probability density function (PDF) $\varphi$ exists if and 
only if the first measure is absolutely continuous with respect to the second one: 

$$P_{\xi|\eta}(\cdot \mid y) \ll P^*_{\xi|\eta}(\cdot \mid y). \tag{4.3}$$

That is, for all Borel subsets $S \subset \mathcal{S}_1$, the fulfillment of $P^*_{\xi|\eta}(S \mid y) = 0$ implies $P_{\xi|\eta}(S \mid y) = 0$. The 
functional $D_0$ in (4.1), which is distinguished from $D$, describes the unconditional relative entropy 

$$D_0(P \| P_*) = \int_{\mathcal{S}} \ln \varphi(x) P(dx) = \int_{\mathcal{S}} L(\varphi(x)) P_*(dx), \qquad \varphi(x) := P(dx)/P_*(dx), \tag{4.4}$$

for probability measures $P \ll P_*$ on a common Polish space $\mathcal{S}$ with an appropriate PDF $\varphi : \mathcal{S} \to \mathbb{R}_+$. 
The conditional relative entropy $D(P_{\xi|\eta} \| P^*_{\xi|\eta})$ in (4.1) is well-defined if the conditional absolute 
continuity (4.3) holds for $P_\eta$-almost all values $y \in \mathcal{S}_2$ of the random element $\eta$. It follows from 
the properties of relative entropy [8, 13] that both functionals $D$ and $D_0$ are always nonnegative and 
vanish only on equal measures (so that, in particular, $D_0(P \| P_*) = 0$ if and only if $P = P_*$). 

$^2$ A discussion of the generation of such noise can be found, for example, in physics literature on open systems [6]. 
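
For distributions on finite sets, the definitions (4.1), (4.2) and (4.4) become finite sums. The following Python sketch is a minimal illustration under stated assumptions: probability distributions are represented as dictionaries, the function names are hypothetical, and the value `math.inf` is returned when the absolute continuity (4.3) fails, which is a common convention rather than something stated in the text above.

```python
import math


def L(p):
    """The function (4.2), with the convention L(0) = 0."""
    return 0.0 if p == 0.0 else p * math.log(p)


def relative_entropy(P, P_ref):
    """Unconditional relative entropy D_0(P || P_ref) from (4.4) on a finite set."""
    total = 0.0
    for x, p in P.items():
        if p == 0.0:
            continue  # zero-mass points do not contribute
        ref = P_ref.get(x, 0.0)
        if ref == 0.0:
            return math.inf  # assumed convention when P is not << P_ref
        total += ref * L(p / ref)  # second integral in (4.4)
    return total


def conditional_relative_entropy(P_cond, P_ref_cond, P_eta):
    """Conditional relative entropy (4.1): the P_eta-average of D_0 over y."""
    return sum(P_eta[y] * relative_entropy(P_cond[y], P_ref_cond[y])
               for y in P_eta if P_eta[y] > 0.0)


# Hypothetical two-point example.
reference = {0: 0.25, 1: 0.75}
P_eta = {"a": 0.5, "b": 0.5}
P_cond = {"a": {0: 0.5, 1: 0.5}, "b": dict(reference)}
P_ref_cond = {"a": reference, "b": reference}

d0 = relative_entropy({0: 0.5, 1: 0.5}, reference)
d = conditional_relative_entropy(P_cond, P_ref_cond, P_eta)
```

Here `d0` equals $\tfrac{1}{2}\ln(4/3)$, and `d` is half of that value, since the conditional distribution coincides with the reference one for the second value of the conditioning element.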

Now, when quantifying the deviation of the actual conditional probability distribution $P_{W_{s:t-1}|X_s}$ 
of the noise sequence $W_{s:t-1}$ on the time interval $[s,t)$ from its nominal model $R^{t-s}$, the conditional 
relative entropy (4.1) takes the form 

$$\begin{aligned}
E_{s,t} &:= D(P_{W_{s:t-1}|X_s} \| R^{t-s}) = \mathbf{E} \ln \varphi_{s,t}(W_{s:t-1} \mid X_s) \\
&= \int_{\mathcal{X} \times \mathcal{W}^{t-s}} \ln \varphi_{s,t}(w \mid x) Q_{s,t}(dx \times dw) \\
&= \int_{\mathcal{X}} \Big( \int_{\mathcal{W}^{t-s}} L(\varphi_{s,t}(w \mid x)) R^{t-s}(dw) \Big) P_s(dx) \\
&= \int_{\mathcal{X}} D_0\big(P_{W_{s:t-1}|X_s}(\cdot \mid x) \,\|\, R^{t-s}\big) P_s(dx), \qquad \varphi_{s,t}(w \mid x) := \frac{P_{W_{s:t-1}|X_s}(dw \mid x)}{R^{t-s}(dw)},
\end{aligned} \tag{4.5}$$

where the expectation is taken over the probability distribution $Q_{s,t}$ of the state-noise sequence $Y_{s:t-1}$ 
from (2.7) and (2.9). Here, the distribution of the noise sequence $W_{s:t-1}$, conditioned on $X_s$, is 
assumed to be absolutely continuous with respect to the corresponding nominal distribution $R^{t-s}$ in 
the sense that 

$$P_{W_{s:t-1}|X_s}(\cdot \mid x) \ll R^{t-s} \quad \text{for } P_s\text{-almost all } x \in \mathcal{X}. \tag{4.6}$$

This ensures that the conditional PDF $\varphi_{s,t} : \mathcal{W}^{t-s} \times \mathcal{X} \to \mathbb{R}_+$ in (4.5) exists and the quantity $E_{s,t}$ is 
well-defined. Further discussion will be concerned with a class of "admissible" probability distributions 
for the noise as specified below. 

DEFINITION 4.1. A noise sequence $W$, which drives the system dynamics (2.1), is called admissible 
if the conditional probability distribution $P_{W_{s:t-1}|X_s}$ satisfies (4.6) for any times $0 \leq s < t$. 

The conditional relative entropy $E_{s,t}$ in (4.5), which is always nonnegative, vanishes for all 
$0 \leq s < t$ if and only if the noise sequence $W$ is $R^\infty$-distributed and independent of the initial state 
$X_0$. In what follows, when considering the system on a time interval $[s,t)$, we will always assume that 
the distribution of the initial state $X_s$ is absolutely continuous with respect to the nominal invariant 
state distribution $P_*$. That is, 

$$P_s \ll P_*. \tag{4.7}$$
In view of the chain rule (2.9), the fulfillment of conditions (4.6) and (4.7) implies that the actual 
probability distribution $Q_{s,t}$ of the state-noise sequence from (2.7) is absolutely continuous with 
respect to the corresponding product measure: 

$$Q_{s,t} \ll P_* \times R^{t-s}. \tag{4.8}$$

Note that (4.8) implies that the property (4.7) will be inherited by the subsequent state distribution 
$P_t$. Indeed, since $P_t$ and $P_* = (P_* \times R^{t-s}) \circ F_{t-s}^{-1}$ are the image measures of $Q_{s,t}$ and $P_* \times R^{t-s}$ under 
the same map $F_{t-s}$ in view of (2.8) and (3.5), then (4.8) implies that $P_t \ll P_*$. Therefore, if $P_0 \ll P_*$, 
then for any admissible noise $W$ in the sense of Definition 4.1, the property $P_t \ll P_*$ holds for any 
$t > 0$. 

Although (4.5) requires only the conditional absolute continuity condition (4.6) for the noise, 
the additional absolute continuity (4.7) for the state distributions will play a role in Section 5. Under 
the conditions (4.6) and (4.7), the chain rule (2.9) allows the PDF of $Q_{s,t}$ with respect to the reference 
measure $P_* \times R^{t-s}$ in (4.8) to be factorized as 

$$\frac{Q_{s,t}(dx \times dw)}{P_*(dx) R^{t-s}(dw)} = \varpi_s(x) \varphi_{s,t}(w \mid x), \qquad x \in \mathcal{X}, \ w \in \mathcal{W}^{t-s}. \tag{4.9}$$

Here, $\varphi_{s,t}$ is the conditional PDF of the noise sequence $W_{s:t-1}$ given $X_s$ from (4.5), and $\varpi_s : \mathcal{X} \to \mathbb{R}_+$ 
is the PDF of the actual state distribution $P_s$ with respect to the nominal invariant state distribution 
$P_*$: 

$$\varpi_s(x) := P_s(dx)/P_*(dx), \qquad x \in \mathcal{X}. \tag{4.10}$$

In what follows, we will study several variational problems which involve the conditional relative 
entropy (4.5). The quantity $E_{s,t}$, which is a measure of deviation from the nominal noise model 
(3.1), can be regarded as a resource which the noise player would prefer to spend economically in 
performing the role of driving the system away from the nominal invariant state distribution. 

5. Conditional relative entropy balance equation and dissipation inequality. For the purposes 
of the subsequent sections, we will now discuss several properties of the conditional relative 
entropy $E_{0,t}$, defined in (4.5), starting with its decomposition which employs time reversal and 
Bayesian analysis ideas [4]. Let $P^*_{Y_{0:t-1}|X_t}$ denote the conditional (given $X_t$) probability distribution 
which the state-noise sequence $Y_{0:t-1}$ would have if the system (2.1) were initialized at the nominal 
invariant state distribution $P_*$ and were subjected to the nominal noise $W$ in the sense of 
Definition 3.1 (in which case, the unconditional probability distribution $Q_{0,t}$ of $Y_{0:t-1}$ would be $P_* \times R^t$). 
The fact that $Y_{0:t-1}$, associated with the time interval $[0,t)$, is conditioned here on the terminal state 
of the system $X_t = F_t(Y_{0:t-1})$ under the nominal noise, with $F_t$ the $t$ step state transition map from 
(2.2), motivates the following definition. 

DEFINITION 5.1. The conditional probability distribution $P^*_{Y_{0:t-1}|X_t}(\cdot \mid x)$ of a $P_* \times R^t$-distributed 
random element $\eta$, conditioned on $F_t(\eta) = x$, is called the nominal posterior distribution of the 
state-noise sequence $Y_{0:t-1}$. 

Note that the nominal posterior distribution $P^*_{Y_{0:t-1}|X_t}$ is uniquely determined by the integral 
equation 

$$\int_{\mathcal{X} \times \mathcal{W}^t} g(y, F_t(y)) (P_* \times R^t)(dy) = \int_{\mathcal{X}} \Big( \int_{\mathcal{X} \times \mathcal{W}^t} g(y, x) P^*_{Y_{0:t-1}|X_t}(dy \mid x) \Big) P_*(dx),$$

which must be satisfied for Borel measurable functions $g : (\mathcal{X} \times \mathcal{W}^t) \times \mathcal{X} \to \mathbb{R}$ and is closely related 
to Bayes formula. Here, use is made of the property that the random element $F_t(\eta)$ in Definition 5.1 
has the nominal invariant state distribution $P_*$. 
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LEMMA 5.2. Suppose the initial state distribution of the system (2.1) satisfies Pq<^P%, and the 
noise W is admissible in the sense of Definition 4.1. Then for any t > 0, the conditional relative 
entropy Eq,, defined by (4.5), is representable as 

^=Do(Pr||PO-Do(ft||^+»(P^iWll p L-il*)- (5A) 

Here, Do is the relative entropy functional (4.4), and Py Q a Xi is the nominal posterior distribution 
of the state-noise sequence Yo t _ \ from Definition 5.1. 

Proof. The factorization (4.9) (see also the chain rale for the relative entropy [13, Lemma 5.3.1 
on p. 94]) implies that 

D (Go, t ||A x R 1 ) = E\n{m(X )(po,,{Wo:r-i | X Q )) 

= ElnGJo(X )+Eln(MWo ;r _i \ x o) 

= D (P \\PJ+E 0>t , (5.2) 

where the expectation is taken over the actual probability distribution Qo, t of the state-noise sequence 
Yq-i-i. Here, % tt is the conditional PDF of Wo :t -i given X from (4.5), and GJo is the PDF (4.10) of 
the initial state distribution Pq with respect to the nominal invariant state distribution P*. Further- 
more, since X t = F t (Yo- t -i) depends in a deterministic way on Fo:r-i in view of (2.2), so that the 
conditional distribution Px,|r f _i (' | y) is an atomic probability measure [33, p. 46] concentrated on 
the singleton {F t (y)} for any y € SC x W 1 regardless of the probability distribution of Fo:r-i> then 
the augmentation of To:r-i by %t does not change the relative entropy in (5.2). More precisely, by 
using a x /^'-distributed random element 77 from Definition 5.1 and applying the relative entropy 
chain rule again, it follows that 

Do(Pr : ( -i^J|P^Kn))= D o( p i'o :t -il|Pt,)+D(P X( | % _ 1 ||P F(W ^) 

= D (Q 0j \\P,xR'). (5.3) 

Here,D(Px,|y 07 _ 1 ||PF I (ij)|rj) =0becausethe conditional probability distributions P^[y , t _j(- \y) and 
Pf,(ti)\ti (' I y) ore identical to each other as discussed above. Now, application of the relative entropy 
chain rule to the left-hand side of (5.3) in the opposite time direction, with io:f-i being conditioned 
on X t , yields 

Do(Pr 0;( _ 1 ,xjP^F t (^))=Do(Px t ||PF (W )+D(P Ko:( _ lK ||P^| J?((rj) ) 

= Do(P,||P*)+D(P y0:f _ 1 | Z ,||P* 0:( _ i|X( ). (5.4) 

Here, use is made of Definition 5.1 of the nominal posterior distribution $P^*_{Y_{0:t-1}|X_t}$ and the property
that $F_t(\eta)$ is $P_*$-distributed. Also, the absolute continuity $P_t \ll P_*$ is ensured by the assumption
that $P_0 \ll P_*$ and the admissibility of the noise in the sense of Definition 4.1. By a straightforward
comparison of (5.2)-(5.4), it follows that

$$D_0(Q_{0,t} \| P_* \times R^t) = D_0(P_0 \| P_*) + E_{0,t} = D_0(P_t \| P_*) + D(P_{Y_{0:t-1}|X_t} \| P^*_{Y_{0:t-1}|X_t}),$$

where the second equality is equivalent to the representation (5.1), and the proof of the lemma is
completed. □
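The balance equation (5.1) can be checked numerically on a small finite example. The following sketch is only an illustration under assumptions that are not part of the paper: a three-point state space, a binary noise alphabet, the hypothetical dynamics $X_1 = (X_0 + W_0) \bmod 3$ (for which the uniform distribution is invariant under any nominal noise law), and a randomly drawn actual law of $(X_0, W_0)$.

```python
import numpy as np

def kl(p, q):
    """Relative entropy sum(p * ln(p / q)) over the entries with p > 0."""
    p, q = np.asarray(p, float).ravel(), np.asarray(q, float).ravel()
    m = p > 0
    return float(np.sum(p[m] * np.log(p[m] / q[m])))

n_x, n_w = 3, 2                        # toy alphabet sizes (assumed)

def step(x, w):
    """Hypothetical one-step dynamics X_1 = f(X_0, W_0)."""
    return (x + w) % n_x

R = np.array([0.6, 0.4])               # nominal noise law (assumed)
P_star = np.full(n_x, 1.0 / n_x)       # invariant under the nominal noise for this f
nominal = np.outer(P_star, R)          # law of eta, that is, P_* x R

rng = np.random.default_rng(0)
Q = rng.dirichlet(np.ones(n_x * n_w)).reshape(n_x, n_w)  # actual law of (X_0, W_0)
P0 = Q.sum(axis=1)
supply = kl(Q, np.outer(P0, R))        # E_{0,1}

# Push the actual and nominal joint laws through the state transition map.
P1, P1_nom = np.zeros(n_x), np.zeros(n_x)
for x in range(n_x):
    for w in range(n_w):
        P1[step(x, w)] += Q[x, w]
        P1_nom[step(x, w)] += nominal[x, w]

# Conditional relative entropy of the actual posterior of (X_0, W_0) given X_1
# with respect to the nominal posterior (the "dissipated" term in (5.1)).
dissipated = 0.0
for x1 in range(n_x):
    mask = np.array([[step(x, w) == x1 for w in range(n_w)] for x in range(n_x)])
    post = np.where(mask, Q, 0.0) / P1[x1]
    post_nom = np.where(mask, nominal, 0.0) / P1_nom[x1]
    dissipated += P1[x1] * kl(post, post_nom)

balance_gap = supply - (kl(P1, P_star) - kl(P0, P_star) + dissipated)
```

In this sketch `balance_gap` vanishes up to rounding errors, and dropping the nonnegative term `dissipated` gives the corresponding one-step dissipation inequality.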
The conditional relative entropy $E_{0,t}$ in (4.5) can be interpreted as the supply which the noise
player has to deliver to the system over the time interval $[0,t)$ in order to make the state distribution $P_t$
of the system deviate from the nominal invariant state distribution $P_*$. In view of the relative entropy
balance equation (5.1), only part of this "expenditure", namely, $D_0(P_t \| P_*) - D_0(P_0 \| P_*)$, contributes
directly to achieving this goal. The rest of the conditional relative entropy supply $E_{0,t}$ is "dissipated"
into $D(P_{Y_{0:t-1}|X_t} \| P^*_{Y_{0:t-1}|X_t})$, which quantifies the amount by which the actual conditional probability
distribution of the state-noise sequence $Y_{0:t-1}$ given $X_t$ can be distinguished from the nominal posterior
distribution $P^*_{Y_{0:t-1}|X_t}$. This dissipation is caused by an irreversible loss of information contained
in the state-noise sequence $Y_{0:t-1}$, only a fraction of which can be encoded in the terminal state
$X_t = F_t(Y_{0:t-1})$ of the system in a bijective way. Omitting the term $D(P_{Y_{0:t-1}|X_t} \| P^*_{Y_{0:t-1}|X_t}) \ge 0$, the
equality (5.1) implies that

$$D_0(P_t \| P_*) \le D_0(P_0 \| P_*) + E_{0,t}. \tag{5.5}$$

By analogy with deterministic dissipativity theory [45, pp. 327, 348], the relation (5.5) describes
a relative entropy dissipation inequality. Accordingly, the state relative entropy $D_0(P_t \| P_*)$, which
quantifies the deviation of the actual state distribution $P_t$ from the nominal invariant state distribution
$P_*$, plays the role of a storage function at time $t$. Note, however, that in the stochastic setting under
consideration, these entropy theoretic functionals do not inherit all the properties of the corresponding
concepts for deterministic dissipative systems. For example, unlike the deterministic integral
supply which, as a function of an interval of time, is additive [33, p. 23] with respect to the union of
disjoint time intervals, the conditional relative entropy supply (4.5) is, in general, superadditive as
described below.

LEMMA 5.3. For any $0 < s < t$, the conditional relative entropy supply $E_{0,t}$ over the time interval
$[0,t)$, defined by (4.5), is not less than the sum of the supplies over the constituent subintervals
$[0,s)$ and $[s,t)$:

$$E_{0,t} \ge E_{0,s} + E_{s,t}. \tag{5.6}$$

The inequality (5.6) becomes an equality if and only if the three random elements $Y_{0:s-1}$, $X_s$, $W_{s:t-1}$ form
a Markov chain.

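Proof. The chain rule for joint PDFs with respect to product measures allows the conditional
PDF $\varphi_{0,t}$ in (4.5) to be factorized as

$$\begin{aligned}
\varphi_{0,t}(w_0,\ldots,w_{t-1} \mid x_0)
&= \frac{P_{W_{0:t-1}|X_0}(dw_0 \times \ldots \times dw_{t-1} \mid x_0)}{R(dw_0) \times \ldots \times R(dw_{t-1})} \\
&= \frac{P_{W_{0:s-1}|X_0}(dw_0 \times \ldots \times dw_{s-1} \mid x_0)}{R(dw_0) \times \ldots \times R(dw_{s-1})}\,
   \frac{P_{W_{s:t-1}|Y_{0:s-1}}(dw_s \times \ldots \times dw_{t-1} \mid x_0,w_0,\ldots,w_{s-1})}{R(dw_s) \times \ldots \times R(dw_{t-1})} \\
&= \varphi_{0,s}(w_0,\ldots,w_{s-1} \mid x_0)\, \psi_{s,t}(w_s,\ldots,w_{t-1} \mid x_0,w_0,\ldots,w_{s-1})
\end{aligned} \tag{5.7}$$

for all $x_0 \in \mathcal{X}$ and $w_0,\ldots,w_{t-1} \in \mathcal{W}$. Here, $\psi_{s,t} : \mathcal{W}^{t-s} \times \mathcal{X} \times \mathcal{W}^s \to \mathbb{R}_+$ is the conditional PDF
of the noise sequence $W_{s:t-1}$, given the state-noise sequence $Y_{0:s-1}$, with respect to the reference
measure $R^{t-s}$:

$$\psi_{s,t}(w_s,\ldots,w_{t-1} \mid x_0,w_0,\ldots,w_{s-1}) := \frac{P_{W_{s:t-1}|Y_{0:s-1}}(dw_s \times \ldots \times dw_{t-1} \mid x_0,w_0,\ldots,w_{s-1})}{R(dw_s) \times \ldots \times R(dw_{t-1})}. \tag{5.8}$$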

Therefore, substitution of (5.7) and (5.8) into the definition (4.5) of the conditional relative entropy
$E_{0,t}$ yields

$$\begin{aligned}
E_{0,t} &= \mathbf{E} \ln \varphi_{0,t}(W_{0:t-1} \mid X_0) \\
&= \mathbf{E} \ln\big(\varphi_{0,s}(W_{0:s-1} \mid X_0)\, \psi_{s,t}(W_{s:t-1} \mid Y_{0:s-1})\big) \\
&= \mathbf{E} \ln \varphi_{0,s}(W_{0:s-1} \mid X_0) + \mathbf{E} \ln \psi_{s,t}(W_{s:t-1} \mid Y_{0:s-1}) \\
&= E_{0,s} + D(P_{W_{s:t-1}|Y_{0:s-1}} \| R^{t-s}).
\end{aligned} \tag{5.9}$$

The inequality (5.6), which describes the superadditivity of the conditional relative entropy supply,
can now be obtained by combining (5.9) with

$$D(P_{W_{s:t-1}|Y_{0:s-1}} \| R^{t-s}) \ge D(P_{W_{s:t-1}|X_s} \| R^{t-s}) = E_{s,t}. \tag{5.10}$$

The last inequality follows from the property that $X_s = F_s(Y_{0:s-1})$ depends in a deterministic way on
$Y_{0:s-1}$ through a Borel measurable map, whereby the conditioning on $Y_{0:s-1}$ is finer than that on $X_s$.
For a rigorous proof of the inequality in (5.10), we consider the conditional probability distribution
of $Y_{0:s-1}$ given $X_s$:

$$\theta_s(B \mid x) := P_{Y_{0:s-1}|X_s}(B \mid x), \qquad B \in \mathfrak{X} \times \mathfrak{W}^s, \quad x \in \mathcal{X}. \tag{5.11}$$

Recall that such posterior distributions of the state-noise sequences were used in Lemma 5.2. In
view of (2.2), for any given $x \in \mathcal{X}$, the probability measure $\theta_s(\cdot \mid x)$ on $(\mathcal{X} \times \mathcal{W}^s, \mathfrak{X} \times \mathfrak{W}^s)$ is
concentrated on the pre-image $F_s^{-1}(x) = \{y \in \mathcal{X} \times \mathcal{W}^s : F_s(y) = x\}$ of the point $x$ under the $s$ step
state transition map $F_s$ in the sense that $\theta_s(F_s^{-1}(x) \mid x) = 1$. Then the conditional PDF $\varphi_{s,t}$ from (4.5)
is representable as an appropriate average of the conditional PDF (5.8) over the probability measure
(5.11) as

$$\varphi_{s,t}(w \mid x) = \mathbf{E}(\psi_{s,t}(w \mid Y_{0:s-1}) \mid X_s = x) = \int_{F_s^{-1}(x)} \psi_{s,t}(w \mid y)\, \theta_s(dy \mid x) \tag{5.12}$$

for all $w \in \mathcal{W}^{t-s}$ and $x \in \mathcal{X}$. Since the function $L$ in (4.2) is strictly convex, (5.12) and Jensen's
inequality [36] imply that

$$L(\varphi_{s,t}(w \mid x)) \le \mathbf{E}(L(\psi_{s,t}(w \mid Y_{0:s-1})) \mid X_s = x) = \int_{F_s^{-1}(x)} L(\psi_{s,t}(w \mid y))\, \theta_s(dy \mid x). \tag{5.13}$$

Moreover, the inequality in (5.13) becomes an equality if and only if $\psi_{s,t}(w \mid y) = \varphi_{s,t}(w \mid x)$ holds for
$\theta_s(\cdot \mid x)$-almost all $y \in F_s^{-1}(x)$. Hence, the conditional relative entropy supply $E_{s,t}$ in (4.5) satisfies

$$\begin{aligned}
E_{s,t} &= \int_{\mathcal{X}} \Big( \int_{\mathcal{W}^{t-s}} L(\varphi_{s,t}(w \mid x))\, R^{t-s}(dw) \Big) P_s(dx) \\
&\le \int_{\mathcal{X}} \Big( \int_{\mathcal{W}^{t-s}} \Big( \int_{F_s^{-1}(x)} L(\psi_{s,t}(w \mid y))\, \theta_s(dy \mid x) \Big) R^{t-s}(dw) \Big) P_s(dx) \\
&= \int_{\mathcal{X}} \Big( \int_{F_s^{-1}(x)} \Big( \int_{\mathcal{W}^{t-s}} L(\psi_{s,t}(w \mid y))\, R^{t-s}(dw) \Big) \theta_s(dy \mid x) \Big) P_s(dx) \\
&= \int_{\mathcal{X} \times \mathcal{W}^s} \Big( \int_{\mathcal{W}^{t-s}} L(\psi_{s,t}(w \mid y))\, R^{t-s}(dw) \Big) Q_{0,s}(dy) \\
&= D(P_{W_{s:t-1}|Y_{0:s-1}} \| R^{t-s}),
\end{aligned} \tag{5.14}$$

which establishes the inequality in (5.10) and completes the proof of (5.6). Turning to the second
part of the lemma, note that (5.6) becomes an equality if and only if the inequality in (5.14) becomes
an equality. By the strict convexity of the function $L$ from (4.2), it follows from (5.12) and (5.13)
that the inequality in (5.14) becomes an equality if and only if

$$\psi_{s,t}(w \mid y) = \varphi_{s,t}(w \mid F_s(y)) \quad \text{for } Q_{0,t}\text{-almost all } (y,w) \in (\mathcal{X} \times \mathcal{W}^s) \times \mathcal{W}^{t-s}. \tag{5.15}$$

In view of (4.5) and (5.8), the relation (5.15) holds if and only if $P_{W_{s:t-1}|Y_{0:s-1}}$ depends on $Y_{0:s-1}$ only
through $X_s = F_s(Y_{0:s-1})$, which is equivalent to the condition that the three random elements $Y_{0:s-1}$,
$X_s$, $W_{s:t-1}$ form a Markov chain. □
As can be seen from the above proof, Lemma 5.3 is closely related to the data processing
inequality and convexity of the relative entropy functional with respect to each of its arguments
[8, 13]. The second assertion of Lemma 5.3 shows that the conditional relative entropy supply $E_{s,t}$
is additive as a function of the time interval $[s,t)$ (that is, $E_{s,u} = E_{s,t} + E_{t,u}$ for all $0 \le s < t < u$)
if and only if the noise sequence $W$ is Markov with respect to the state sequence $X$ of the system. The
Markov property means that, for every time $k \ge 0$, the probability measures $P_{W_k|Y_{0:k-1}}(\cdot \mid y)$ and
$P_{W_k|X_k}(\cdot \mid F_k(y))$ on $(\mathcal{W}, \mathfrak{W})$ are equal to each other for $Q_{0,k}$-almost all values $y \in \mathcal{X} \times \mathcal{W}^k$ of the
state-noise sequence $Y_{0:k-1}$. It turns out that the Markov property is important for noise strategies to
be economical in the sense of the conditional relative entropy supply.
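The superadditivity (5.6) can be observed on a finite toy model. The sketch below is an illustration under assumptions that are not taken from the paper: finite state and noise alphabets, the hypothetical dynamics $X_{k+1} = (X_k + W_k) \bmod 3$, the time instants $s = 1$, $t = 2$, and a randomly drawn (hence, generically non-Markov) joint law of $(X_0, W_0, W_1)$. It uses the fact that $E_{s,t}$ is the relative entropy of the law of $(X_s, W_{s:t-1})$ with respect to $P_s \times R^{t-s}$.

```python
import numpy as np

def kl(p, q):
    """Relative entropy sum(p * ln(p / q)) over the entries with p > 0."""
    p, q = np.asarray(p, float).ravel(), np.asarray(q, float).ravel()
    m = p > 0
    return float(np.sum(p[m] * np.log(p[m] / q[m])))

def push(Q, n_x):
    """Replace the leading (state, noise) axes of Q by the next state (x + w) mod n_x."""
    out = np.zeros((n_x,) + Q.shape[2:])
    for x in range(Q.shape[0]):
        for w in range(Q.shape[1]):
            out[(x + w) % n_x] += Q[x, w]
    return out

n_x, n_w = 3, 2                        # toy alphabet sizes (assumed)
R = np.array([0.6, 0.4])               # nominal noise law (assumed)
rng = np.random.default_rng(1)
# Actual joint law of (X_0, W_0, W_1).
Q02 = rng.dirichlet(np.ones(n_x * n_w * n_w)).reshape(n_x, n_w, n_w)

P0 = Q02.sum(axis=(1, 2))
Q01 = Q02.sum(axis=2)                  # law of (X_0, W_0)
Q12 = push(Q02, n_x)                   # law of (X_1, W_1)
P1 = Q12.sum(axis=1)

E02 = kl(Q02, P0[:, None, None] * R[None, :, None] * R[None, None, :])
E01 = kl(Q01, np.outer(P0, R))
E12 = kl(Q12, np.outer(P1, R))
gap = E02 - (E01 + E12)                # nonnegative by (5.6)
```

The quantity `gap` is the difference between the two conditional relative entropies in (5.10) and vanishes only when $W_1$ depends on $(X_0, W_0)$ through $X_1$ alone.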

6. Markov noise strategies decrease the conditional relative entropy supply. We will now
introduce a specific change of measure which leads to the Markov property of the noise with respect
to the state of the system. More precisely, for any $s > 0$, consider an operator $M_s$ which, for any
$t > s$, maps the probability distribution $Q_{0,t}$ of the state-noise sequence $Y_{0:t-1}$, associated with an
admissible noise $W$, to another probability distribution $\widehat{Q}_{0,t} := M_s(Q_{0,t})$ on the same measurable
space $(\mathcal{X} \times \mathcal{W}^t, \mathfrak{X} \times \mathfrak{W}^t)$ as

$$\widehat{Q}_{0,t}(dy \times dw) = \varphi_{s,t}(w \mid F_s(y))\, Q_{0,s}(dy)\, R^{t-s}(dw), \qquad y \in \mathcal{X} \times \mathcal{W}^s, \quad w \in \mathcal{W}^{t-s}. \tag{6.1}$$

Here, $\varphi_{s,t}$ is the conditional PDF of the noise sequence $W_{s:t-1}$ given $X_s$ from (4.5), associated with
the original distribution $Q_{0,t}$, and $F_s$ is the $s$ step state transition map from (2.2) and (2.3). In order
to clarify the meaning of (6.1), note that

$$Q_{0,t}(dy \times dw) = \psi_{s,t}(w \mid y)\, Q_{0,s}(dy)\, R^{t-s}(dw) \tag{6.2}$$

in view of the factorization (5.7) and the definition of the conditional PDF $\psi_{s,t}$ in (5.8). Direct
comparison of (6.1) with (6.2) shows that the action of the operator $M_s$ on $Q_{0,t}$ leads to the Markov
property of the state-noise sequence $Y_{0:t-1}$ with respect to the intermediate state $X_s$ by replacing the
left-hand side $\psi_{s,t}(w \mid y)$ of (5.15) with its right-hand side $\varphi_{s,t}(w \mid F_s(y))$. Therefore, an equivalent
representation of $M_s$ in terms of the conditional PDFs from (4.5) is

$$\widehat{\varphi}_{0,t}(w_0,\ldots,w_{t-1} \mid x_0) := \varphi_{0,s}(w_0,\ldots,w_{s-1} \mid x_0)\, \varphi_{s,t}(w_s,\ldots,w_{t-1} \mid F_s(x_0,w_0,\ldots,w_{s-1})) \tag{6.3}$$

for all $x_0 \in \mathcal{X}$ and $w_0,\ldots,w_{t-1} \in \mathcal{W}$, where $\widehat{\varphi}_{0,t}$ corresponds to $\widehat{Q}_{0,t}$ in (6.1), whilst $\varphi_{0,s}$ and $\varphi_{s,t}$ are
associated with $Q_{0,s}$ and $Q_{s,t}$. Under the new measure $\widehat{Q}_{0,t}$, the random elements $Y_{0:s-1}$, $X_s$, $W_{s:t-1}$
form a Markov chain. The operator $M_s$ is idempotent (that is, $M_s^2 := M_s \circ M_s = M_s$), since those (and
only those) probability distributions $Q_{0,t}$, which are already Markov with respect to $X_s$, are invariant
under $M_s$.

LEMMA 6.1. For any $0 < s < t$, the operator $M_s$, acting on the probability distribution $Q_{0,t}$ in
(6.2) as described by (6.1), leaves the probability distributions $Q_{0,s}$ and $Q_{s,t}$ and the state
distributions $P_0, \ldots, P_t$ unchanged. The conditional relative entropy supply $\widehat{E}_{0,t}$ on the time interval $[0,t)$,
associated with the new measure $\widehat{Q}_{0,t}$, satisfies

$$\widehat{E}_{0,t} \le E_{0,t}. \tag{6.4}$$

The inequality in (6.4) becomes an equality if and only if the measure $Q_{0,t}$ is invariant under $M_s$,
that is, if and only if the three random elements $Y_{0:s-1}$, $X_s$, $W_{s:t-1}$ form a Markov chain.

Proof. Throughout the proof, the probability distributions and other quantities associated with
the new measure $\widehat{Q}_{0,t}$ will be marked by the "hat" symbol. The property that the operator $M_s$
preserves the probability distribution of $Y_{0:s-1}$,

$$\widehat{Q}_{0,s} = Q_{0,s}, \tag{6.5}$$

is verified by using an appropriate marginal distribution obtained from $\widehat{Q}_{0,t}$ via integrating both sides
of (6.1) over $w \in \mathcal{W}^{t-s}$, since $\int_{\mathcal{W}^{t-s}} \varphi_{s,t}(w \mid x)\, R^{t-s}(dw) = 1$ for any $x \in \mathcal{X}$. Hence, the conditional
relative entropy supply $E_{0,s}$ over the time interval $[0,s)$, which is completely specified by $Q_{0,s}$,
remains unaffected:

$$\widehat{E}_{0,s} = E_{0,s}. \tag{6.6}$$

Furthermore, (6.5) implies that the state distributions $P_0, \ldots, P_s$ of the system up until time $s$ are also
preserved:

$$\widehat{P}_k = P_k, \qquad k = 0,\ldots,s. \tag{6.7}$$

It follows from (6.1) that $M_s$ also preserves the conditional probability distribution $P_{W_{s:t-1}|X_s}$. Hence,
the conditional PDF $\varphi_{s,t}$ from (4.5) is also preserved: $\widehat{\varphi}_{s,t} = \varphi_{s,t}$. This property, combined with the
equality $\widehat{P}_s = P_s$ from (6.7), yields

$$\begin{aligned}
\widehat{Q}_{s,t}(dx \times dw) &= \widehat{\varphi}_{s,t}(w \mid x)\, \widehat{P}_s(dx)\, R^{t-s}(dw) \\
&= \varphi_{s,t}(w \mid x)\, P_s(dx)\, R^{t-s}(dw) \\
&= Q_{s,t}(dx \times dw), \qquad x \in \mathcal{X}, \quad w \in \mathcal{W}^{t-s}.
\end{aligned} \tag{6.8}$$

Therefore, the conditional relative entropy supply over the time interval $[s,t)$ is also invariant under
the action of $M_s$:

$$\widehat{E}_{s,t} = E_{s,t}. \tag{6.9}$$

Furthermore, (6.8) implies the invariance of the corresponding state distributions $P_s, \ldots, P_t$ of the
system:

$$\widehat{P}_k = P_k, \qquad k = s,\ldots,t. \tag{6.10}$$

Therefore, in view of (6.7) and (6.10), all the state distributions $P_0, \ldots, P_t$ of the system on the time
interval $[0,t]$ are invariant under $M_s$. Since $Y_{0:s-1}$, $X_s$, $W_{s:t-1}$ form a Markov chain with respect to the
new measure $\widehat{Q}_{0,t}$, a combination of Lemma 5.3 with (6.6) and (6.9) yields

$$\widehat{E}_{0,t} = \widehat{E}_{0,s} + \widehat{E}_{s,t} = E_{0,s} + E_{s,t} \le E_{0,t}, \tag{6.11}$$

which proves (6.4). Using the second assertion of Lemma 5.3, it follows that the inequality in
(6.11) is an equality if and only if $Y_{0:s-1}$, $X_s$, $W_{s:t-1}$ form a Markov chain under the original measure
$Q_{0,t}$. Therefore, to prove the last assertion of Lemma 6.1, it remains only to recall the equivalence
between the Markov property of the probability measure $Q_{0,t}$ and its invariance $M_s(Q_{0,t}) = Q_{0,t}$
under the operator $M_s$ defined in (6.1). □
Using Lemma 6.1, it follows that the application of the operator $M_s$ strictly decreases the
conditional relative entropy supply over the time interval $[0,t)$, thereby leading to a more economical
strategy for the noise player, unless the original noise strategy is already Markov with respect to $X_s$.
From (6.1), it follows that the operators $M_s$ and $M_u$ commute for any times $s < u$, and their
composition $M_s \circ M_u = M_u \circ M_s$ leads to the Markov property of the noise $W$ with respect to the states $X_s$
and $X_u$. More generally, the operator

$$M_{1,t-1} := M_1 \circ \ldots \circ M_{t-1} \tag{6.12}$$

leads to the Markov property of the noise $W$ with respect to the intermediate states $X_1, \ldots, X_{t-1}$. The
resulting probability distribution $\widehat{Q}_{0,t} := M_{1,t-1}(Q_{0,t})$ of the state-noise sequence $Y_{0:t-1}$, whose
conditional PDF is given by

$$\widehat{\varphi}_{0,t}(w_0,\ldots,w_{t-1} \mid x_0) := \prod_{s=0}^{t-1} \varphi_{s,s+1}(w_s \mid F_s(x_0,w_0,\ldots,w_{s-1})),$$

similarly to (6.3), inherits the distributions $Q_{k,k+1} = P_{X_k,W_k}$ from $Q_{0,t}$ for all $k = 0,\ldots,t-1$.
Furthermore, the conditional relative entropy supply $\widehat{E}_{0,t}$, associated with $\widehat{Q}_{0,t}$, is additive on the time
interval $[0,t)$ in the sense of the equalities

$$\widehat{E}_{s,u} = \sum_{k=s}^{u-1} E_{k,k+1} \le E_{s,u}, \qquad 0 \le s < u \le t.$$

The fact that the operator $M_{1,t-1} : Q_{0,t} \mapsto \widehat{Q}_{0,t}$ in (6.12) decreases the conditional relative entropy
supply, while preserving the state distributions of the system, implies that a non-Markov noise
strategy, which drives the system along a given sequence of state distributions $P_0, \ldots, P_t$, can always be
made more economical by replacing $Q_{0,t}$ with the Markov strategy $\widehat{Q}_{0,t}$.
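The action of the operator $M_s$ in (6.1) can be sketched on a finite toy model. The setting below is an assumption made for illustration only (finite alphabets, hypothetical dynamics $X_{k+1} = (X_k + W_k) \bmod 3$, $s = 1$, $t = 2$, a random joint law): the conditional law of $W_1$ given $(X_0, W_0)$ is replaced by the conditional law of $W_1$ given $X_1$, and the supplies before and after are compared.

```python
import numpy as np

def kl(p, q):
    """Relative entropy sum(p * ln(p / q)) over the entries with p > 0."""
    p, q = np.asarray(p, float).ravel(), np.asarray(q, float).ravel()
    m = p > 0
    return float(np.sum(p[m] * np.log(p[m] / q[m])))

def push(Q, n_x):
    """Replace the leading (state, noise) axes of Q by the next state (x + w) mod n_x."""
    out = np.zeros((n_x,) + Q.shape[2:])
    for x in range(Q.shape[0]):
        for w in range(Q.shape[1]):
            out[(x + w) % n_x] += Q[x, w]
    return out

n_x, n_w = 3, 2                        # toy alphabet sizes (assumed)
R = np.array([0.6, 0.4])               # nominal noise law (assumed)
rng = np.random.default_rng(2)
Q02 = rng.dirichlet(np.ones(n_x * n_w * n_w)).reshape(n_x, n_w, n_w)  # law of (X_0, W_0, W_1)

P0 = Q02.sum(axis=(1, 2))
Q01 = Q02.sum(axis=2)                  # law of (X_0, W_0)
Q12 = push(Q02, n_x)                   # law of (X_1, W_1)
P1 = Q12.sum(axis=1)
cond = Q12 / P1[:, None]               # conditional law of W_1 given X_1

# Markovized joint law: W_1 depends on (X_0, W_0) only through X_1.
x1 = (np.arange(n_x)[:, None] + np.arange(n_w)[None, :]) % n_x
Q02_hat = Q01[:, :, None] * cond[x1]

ref = P0[:, None, None] * R[None, :, None] * R[None, None, :]
E02, E02_hat = kl(Q02, ref), kl(Q02_hat, ref)
E01, E12 = kl(Q01, np.outer(P0, R)), kl(Q12, np.outer(P1, R))
```

In this sketch the law of $(X_1, W_1)$, and hence the state distributions, are unchanged by the Markovization, while `E02_hat` equals `E01 + E12` and does not exceed `E02`.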

7. Bellman equation for the minimum required conditional relative entropy supply. Consider
the problem of minimizing the conditional relative entropy supply $E_{0,t}$ in (4.5) required to drive
the system (2.1) from a given initial state distribution $\Phi$ to a given terminal state distribution $\Psi$ over
a time interval of specified length $t > 0$:

$$J_t(\Phi,\Psi) := \inf\{E_{0,t} :\ P_0 = \Phi,\ P_{W_{0:t-1}|X_0} \ll R^t,\ P_t = \Psi\}. \tag{7.1}$$

Here, both probability measures $\Phi$ and $\Psi$ on $(\mathcal{X}, \mathfrak{X})$ are assumed to be absolutely continuous with
respect to the nominal invariant state distribution $P_*$, and the infimum is taken over those admissible
noise strategies $P_{W_{0:t-1}|X_0}$ in the sense of Definition 4.1, under which the state distribution of the
system evolves from $P_0 = \Phi$ to $P_t = \Psi$. Variational problems like (7.1), which involve entropy and
probabilistic boundary conditions, are known as Schrödinger bridge problems [2, 29] and are usually
treated in the context of reciprocal processes, that is, Markov random fields on the time axis [17];
see also [1, 5, 9, 25, 43] for continuous time formulations.

If the system is initialized at a $P_*$-distributed state $X_0$, then application of a nominal noise with
$P_{W|X_0} = R^\infty$ (so that $E_{0,t} = 0$) leaves the state distribution unchanged, and hence,

$$J_t(P_*,P_*) = 0 \tag{7.2}$$

for any time horizon $t > 0$. However, if $\Psi \ne P_*$, then $J_t(P_*,\Psi)$ is positive and quantifies the cost for
the noise player to drive the system from $P_*$ to $\Psi$ in time $t$. The larger $J_t(P_*,\Psi)$ is, the more robust
the system is with respect to the uncertain noise. Minimization of $E_{0,t}$ on the right-hand side of the
dissipation inequality (5.5) under the constraints $P_0 = \Phi$ and $P_t = \Psi$ yields a lower bound

$$J_t(\Phi,\Psi) \ge \max(D_0(\Psi \| P_*) - D_0(\Phi \| P_*),\ 0), \tag{7.3}$$

which also clarifies the role of the assumptions $\Phi \ll P_*$ and $\Psi \ll P_*$ for the well-posedness of the
problem (7.1). However, these absolute continuity conditions are, in general, not enough to
guarantee finiteness of the quantity $J_t(\Phi,\Psi)$ since the discrete-time system (2.1) may lack reachability
with respect to the noise over short time intervals.
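For a finite toy model, the one-step problem (7.1) can be approximated numerically and compared with (7.2) and the lower bound (7.3). The sketch below rests on assumptions that are not part of the paper: a three-point state space with the hypothetical dynamics $X_1 = (X_0 + W_0) \bmod 3$ and a three-point noise alphabet, so that, for every $x$, the map $w \mapsto (x + w) \bmod 3$ is a bijection and noise strategies correspond to couplings of $(X_0, X_1)$. The minimization is carried out by iterative proportional fitting (a Sinkhorn-type iteration), which is a standard technique for such static bridge problems and is swapped in here for illustration.

```python
import numpy as np

def kl(p, q):
    """Relative entropy sum(p * ln(p / q)) over the entries with p > 0."""
    p, q = np.asarray(p, float).ravel(), np.asarray(q, float).ravel()
    m = p > 0
    return float(np.sum(p[m] * np.log(p[m] / q[m])))

def one_step_supply(phi, psi, G, iters=500):
    """Approximate J_1(phi, psi) by iterative proportional fitting.

    ref[x, x'] = phi[x] * G[x, x'] is the law of (X_0, X_1) under the nominal
    noise; the iteration rescales it until both marginals match.
    """
    ref = phi[:, None] * G
    pi = ref.copy()
    for _ in range(iters):
        pi *= (psi / pi.sum(axis=0))[None, :]   # match the terminal marginal
        pi *= (phi / pi.sum(axis=1))[:, None]   # match the initial marginal
    return kl(pi, ref)

n = 3
R = np.array([0.5, 0.3, 0.2])                   # nominal noise law (assumed)
# Nominal transition kernel: G[x, x'] = R[(x' - x) mod n].
G = np.array([[R[(x1 - x0) % n] for x1 in range(n)] for x0 in range(n)])
P_star = np.full(n, 1.0 / n)                    # invariant, since G is doubly stochastic

phi = np.array([0.5, 0.3, 0.2])                 # initial state distribution (assumed)
psi = np.array([0.1, 0.2, 0.7])                 # terminal state distribution (assumed)
J1 = one_step_supply(phi, psi, G)
lower_bound = max(kl(psi, P_star) - kl(phi, P_star), 0.0)
J1_nominal = one_step_supply(P_star, P_star, G)
```

Here `J1` should dominate `lower_bound` in accordance with (7.3), and `J1_nominal` should vanish as in (7.2). With a smaller noise alphabet, some pairs of distributions would not be reachable in one step and the iteration would not converge to a feasible coupling.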

DEFINITION 7.1. A terminal state distribution $\Psi \ll P_*$ is said to be reachable from an initial
state distribution $\Phi \ll P_*$ in time $t > 0$ if the minimum required conditional relative entropy supply
$J_t(\Phi,\Psi)$ in (7.1) is finite.

The following theorem shows that the additivity of the conditional relative entropy supply for
Markov noise strategies, established in Lemma 5.3, plays an important role in determining the
minimum required supply in (7.1).

THEOREM 7.2. For any time horizon $t > 0$, intermediate time $0 < s < t$ and initial and terminal
state distributions $\Phi$ and $\Psi$, the minimum required conditional relative entropy supply (7.1) satisfies

$$J_t(\Phi,\Psi) = \inf_{\Theta}\, (J_s(\Phi,\Theta) + J_{t-s}(\Theta,\Psi)), \tag{7.4}$$

where the infimum is taken over all intermediate state distributions $\Theta$ reachable from $\Phi$ in time $s$
and for which $\Psi$ is reachable from $\Theta$ in time $t - s$. Furthermore, if the infimum in (7.1) is achieved,
then every optimal noise strategy is Markov with respect to the state of the system.

Proof. By using an intermediate state distribution $\Theta$, it follows that the infimum in (7.1) can be
represented as

$$J_t(\Phi,\Psi) = \inf_{\Theta} J_{s,t}(\Phi,\Theta,\Psi), \tag{7.5}$$

where

$$J_{s,t}(\Phi,\Theta,\Psi) := \inf\{E_{0,t} :\ P_{W_{0:t-1}|X_0} \ll R^t,\ P_0 = \Phi,\ P_s = \Theta,\ P_t = \Psi\} \tag{7.6}$$

involves an additional constraint $P_s = \Theta$. Application of the superadditivity (5.6) of the conditional
relative entropy supply to (7.6) yields

$$J_{s,t}(\Phi,\Theta,\Psi) \ge J_s(\Phi,\Theta) + J_{t-s}(\Theta,\Psi). \tag{7.7}$$

We will now prove that the inequality (7.7) is, in fact, an equality from which (7.4) follows
immediately in view of (7.5). Suppose the probability distributions $Q_{0,s}$ and $Q_{s,t}$ are associated with an
admissible noise on the subintervals $[0,s)$ and $[s,t)$ satisfying $P_0 = \Phi$, $P_s = \Theta$, $P_t = \Psi$. Note that
$Q_{0,s}$ and $Q_{s,t}$ are compatible since they ascribe to $X_s$ the same probability distribution $\Theta$. Hence,
there exists a probability distribution $\widehat{Q}_{0,t}$ which is Markov with respect to the intermediate state $X_s$
and leads to the marginal distributions $Q_{0,s}$ and $Q_{s,t}$ described above. The corresponding conditional
PDF $\widehat{\varphi}_{0,t} : \mathcal{W}^t \times \mathcal{X} \to \mathbb{R}_+$ from (4.5) is expressed in terms of $\varphi_{0,s}$ and $\varphi_{s,t}$, associated with $Q_{0,s}$
and $Q_{s,t}$, as described by (6.3). In addition to $P_s = \Theta$, the measure $\widehat{Q}_{0,t}$ also satisfies the boundary
conditions $P_0 = \Phi$ and $P_t = \Psi$ for the state distribution. By Lemma 5.3, the Markov property of $\widehat{Q}_{0,t}$
implies that the conditional relative entropy supply satisfies

$$\widehat{E}_{0,t} = E_{0,s} + E_{s,t}. \tag{7.8}$$

For any $\varepsilon > 0$, each of the measures $Q_{0,s}$ and $Q_{s,t}$ can be chosen so that the corresponding conditional
relative entropy supply is $\varepsilon$-close to its minimal value in (7.1):

$$E_{0,s} \le J_s(\Phi,\Theta) + \varepsilon, \qquad E_{s,t} \le J_{t-s}(\Theta,\Psi) + \varepsilon. \tag{7.9}$$

By combining (7.8) and (7.9), it follows that

$$\widehat{E}_{0,t} \le J_s(\Phi,\Theta) + J_{t-s}(\Theta,\Psi) + 2\varepsilon.$$

Therefore, by combining the suboptimal noise strategies $Q_{0,s}$ and $Q_{s,t}$ into the Markov strategy $\widehat{Q}_{0,t}$
as described by (6.3), the total conditional relative entropy supply $\widehat{E}_{0,t}$ can be made arbitrarily close
to the right-hand side of the inequality (7.7). This implies that (7.7) holds as an equality, thus proving
(7.4) in view of (7.5). We now proceed to the proof of the second assertion of the theorem which
assumes that the infimum in (7.1) is achievable. Let $Q_{0,t}$ be an optimal noise strategy which leads to
the minimum conditional relative entropy supply $E_{0,t} = J_t(\Phi,\Psi)$. Suppose $Q_{0,t}$ is not Markov with
respect to the state signal $X$ of the system on the time interval $[0,t)$. Then application of the operator
(6.12) generates a different measure $\widehat{Q}_{0,t} := M_{1,t-1}(Q_{0,t}) \ne Q_{0,t}$. By Lemma 6.1, $\widehat{Q}_{0,t}$ satisfies the
boundary conditions $P_0 = \Phi$ and $P_t = \Psi$ for the state distribution and delivers a smaller conditional
relative entropy supply $\widehat{E}_{0,t} < E_{0,t}$. The latter, however, contradicts the optimality of $Q_{0,t}$. This
contradiction implies the Markov property of $Q_{0,t}$. □
Special cases of Theorem 7.2, which are obtained by letting $s = 1$ or $s = t - 1$ in (7.4), lead to
a dynamic programming Bellman equation [19, pp. 319-320] for the minimum conditional relative
entropy supply in (7.1):

$$J_{t+1}(\Phi,\Psi) = \inf_{\Theta}\, (J_1(\Phi,\Theta) + J_t(\Theta,\Psi)) = \inf_{\Theta}\, (J_t(\Phi,\Theta) + J_1(\Theta,\Psi)). \tag{7.10}$$

Each of these equalities is a recurrence equation whose right-hand side is an operator acting on the
functional $J_t$. In particular, the minimum supply $J_t(P_*,\Psi)$, required to drive the system from the
nominal invariant state distribution $P_*$ to a different state distribution $\Psi$ in time $t$, is nonincreasing in
$t$. Indeed, (7.10) implies that

$$J_{t+1}(P_*,\Psi) \le J_1(P_*,P_*) + J_t(P_*,\Psi) = J_t(P_*,\Psi)$$

in view of (7.2). Here, $J_t(P_*,\Psi)$ is analogous to the required supply in the sense of [45, Definition 5
on p. 329]. A similar monotonicity condition holds for $J_t(\Phi,P_*)$, which quantifies the cost for the
noise player to drive the state distribution of the system from $\Phi$ to $P_*$ in time $t$. Another
representation of (7.4) in a form, known in the Russian optimization literature as the "Kiev broom", "walking
tube" or "local variation" method (see, for example, [26]), is

$$J_t(\Phi,\Psi) = \inf_{P_1,\ldots,P_{t-1}}\ \sum_{k=0}^{t-1} J_1(P_k,P_{k+1}), \tag{7.11}$$

where the infimum is taken over appropriately reachable intermediate state distributions $P_1, \ldots, P_{t-1}$,
with $P_0 = \Phi$ and $P_t = \Psi$. The sum on the right-hand side of (7.11) is the minimum conditional
relative entropy supply over the time interval $[0,t)$ required to drive the system along a specified
sequence of state distributions $P_0, P_1, \ldots, P_{t-1}, P_t$. In this state distribution tracking problem, any
optimal noise strategy is Markov with respect to the state $X$ of the system. This can be verified by
the argument, employed in the proof of Theorem 7.2, that application of the operator (6.12) leads to
a more economical Markov noise strategy. In particular, the minimum conditional relative entropy
supply rate per time step, required to maintain the system in a fixed state distribution $\Phi$ (reachable
from itself in one step), is

$$\frac{1}{t}\ \inf_{Q_{0,t}:\ P_0 = P_1 = \ldots = P_t = \Phi} E_{0,t} = J_1(\Phi,\Phi). \tag{7.12}$$

The fact that (7.12) holds not only in the infinite-horizon limit $t \to +\infty$ but also for any $t > 0$, is
closely related to the additivity of the conditional relative entropy supply for Markov noise strategies
discussed in Lemma 5.3. The quantity $J_1(\Phi,\Phi)$ is positive if the state distribution $\Phi$ is not invariant
under the nominal noise. In this case, in order to maintain the system in $\Phi$, the noise player has to
persistently deviate from the nominal noise model (3.1). Indeed, any optimal noise strategy in (7.12)
is Markov with respect to the state of the system and is completely specified by the conditional
probability distributions $P_{W_k|X_k}$, $k = 0, \ldots, t-1$. These distributions can be made identical to $P_{W_0|X_0}$
which delivers the minimum value $J_1(\Phi,\Phi)$ in the problem (7.1) with $t = 1$ and $\Psi = \Phi$. The resulting
state sequence $X$ is a homogeneous Markov chain with an invariant measure $\Phi$ and a transition
kernel which is different from $G$ in (3.2). Suppose the loss in system performance, associated with
$\Phi$ being different from the nominal invariant state distribution $P_*$, is quantified by a real-valued
functional $\mathcal{E}(P_*,\Phi)$. For example, the loss functional $\mathcal{E}(P_*,\Phi)$ can describe the undesirable increase
in a moment $\mathbf{E} g(X_k) = \int_{\mathcal{X}} g(x)\, \Phi(dx)$ of the system variables (specified by a function $g : \mathcal{X} \to \mathbb{R}_+$)
over a steady-state distribution $\Phi$ in comparison to the nominal value $\int_{\mathcal{X}} g(x)\, P_*(dx)$ of this moment
under $P_*$. Then the nonnegative quantity



(7.13)

is the minimum cost for the noise player (in terms of the conditional relative entropy supply rate)
to achieve a given level $\gamma$ of the system performance loss. Therefore, $Z(\gamma)$ can be interpreted as a
robustness index for the system: the larger $Z(\gamma)$ is, the more robust the system is with respect to the
uncertain noise. A practically computable version of the robustness index $Z(\gamma)$ in (7.13), associated
with the second moments of state variables, will be considered in Section 11 for a class of linear
systems.

8. Reachability of Gaussian state distributions in linear systems. We will now specialize
the results of the previous sections to linear systems with the state space $\mathcal{X} := \mathbb{R}^n$, input space
$\mathcal{W} := \mathbb{R}^m$, and dynamics (2.1) described by

$$X_{k+1} = A X_k + B W_k. \tag{8.1}$$

Here, $A \in \mathbb{R}^{n \times n}$, $B \in \mathbb{R}^{n \times m}$ are given matrices, and $A$ is assumed to be asymptotically stable (that
is, its spectral radius satisfies $\rho(A) < 1$). Unless specified otherwise, vectors are assumed to be
organized as columns. Also, suppose the nominal marginal distribution $R$ of the noise is the
$m$-dimensional standard normal distribution with zero mean and identity covariance matrix:

$$R := \mathcal{N}(0, I_m). \tag{8.2}$$

Then the corresponding nominal invariant state distribution of the linear system (8.1) is also
Gaussian,

$$P_* = \mathcal{N}(0, \Gamma), \tag{8.3}$$

where the covariance matrix $\Gamma$ coincides with the infinite-horizon reachability Gramian of the pair
$(A,B)$ and satisfies an algebraic Lyapunov equation:

$$\Gamma = \sum_{k=0}^{+\infty} A^k B B^{\mathrm{T}} (A^k)^{\mathrm{T}} = A \Gamma A^{\mathrm{T}} + B B^{\mathrm{T}}. \tag{8.4}$$

The following theorem extends the condition for linear system reachability [18, 19] from signals to
probability distributions and is valid regardless of the asymptotic stability of the matrix $A$. Let

$$\Gamma_t := \sum_{k=0}^{t-1} A^k B B^{\mathrm{T}} (A^k)^{\mathrm{T}} = H_t H_t^{\mathrm{T}} \tag{8.5}$$

denote the reachability Gramian of the system (8.1) for a finite time horizon $t > 0$, where $H_t \in \mathbb{R}^{n \times mt}$
is an auxiliary matrix defined by

$$H_t := \begin{bmatrix} A^{t-1}B & A^{t-2}B & \ldots & AB & B \end{bmatrix}. \tag{8.6}$$
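The objects in (8.4)-(8.6) are straightforward to compute. The following is a minimal sketch with illustrative matrices `A` and `B` (assumed here, not taken from the paper); the infinite-horizon Gramian is obtained by vectorizing the Lyapunov equation (8.4), which is one of several possible solution methods, and the helper names are hypothetical.

```python
import numpy as np

def reach_matrix(A, B, t):
    """H_t = [A^{t-1} B, A^{t-2} B, ..., A B, B] as in (8.6)."""
    blocks = [np.linalg.matrix_power(A, k) @ B for k in range(t - 1, -1, -1)]
    return np.hstack(blocks)

def gramian(A, B, t):
    """Finite-horizon reachability Gramian Gamma_t as the sum in (8.5)."""
    n = A.shape[0]
    G = np.zeros((n, n))
    Ak = np.eye(n)
    for _ in range(t):
        G += Ak @ B @ B.T @ Ak.T
        Ak = A @ Ak
    return G

def lyapunov_gramian(A, B):
    """Solve Gamma = A Gamma A^T + B B^T by vectorization (A must be stable)."""
    n = A.shape[0]
    rhs = (B @ B.T).ravel()
    vec = np.linalg.solve(np.eye(n * n) - np.kron(A, A), rhs)
    return vec.reshape(n, n)

# Illustrative data (assumed): a stable matrix A and a single-input matrix B.
A = np.array([[0.5, 0.2],
              [0.0, 0.3]])
B = np.array([[0.0],
              [1.0]])
H4 = reach_matrix(A, B, 4)
Gamma4 = gramian(A, B, 4)
Gamma = lyapunov_gramian(A, B)
```

For a stable `A`, the finite-horizon Gramians `gramian(A, B, t)` approach `Gamma` as `t` grows, and `H4 @ H4.T` reproduces `Gamma4` in accordance with (8.5).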

THEOREM 8.1. Suppose the linear system (8.1) is endowed with the nominal marginal
distribution (8.2) of the noise, and let

$$\Phi := \mathcal{N}(\alpha, \Sigma), \qquad \Psi := \mathcal{N}(\beta, \Theta), \tag{8.7}$$

be any Gaussian distributions with covariance matrices $\Sigma \succ 0$ and $\Theta \succ 0$. Then the state distribution
of the system can be driven from $P_0 = \Phi$ to $P_t = \Psi$ by an admissible noise within any given time
horizon $t \ge n$ if and only if $(A,B)$ is reachable. Moreover, this can be carried out by using a Gaussian
noise sequence $W_{0:t-1}$ with the conditional distribution

$$P_{W_{0:t-1}|X_0} = \mathcal{N}\Big(H_t^{\mathrm{T}} \Gamma_t^{-1} \big(\beta - A^t \alpha + \big(\sqrt{\Theta - \varepsilon \Gamma_t}\, \Sigma^{-1/2} - A^t\big)(X_0 - \alpha)\big),\ \varepsilon I_{mt}\Big). \tag{8.8}$$

Here, $\Gamma_t$ is the $t$ step reachability Gramian with the associated matrix $H_t$ from (8.5), (8.6), and $\varepsilon$ is
a real parameter satisfying

$$0 < \varepsilon \le 1/\rho(\Gamma_t \Theta^{-1}). \tag{8.9}$$



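Proof. For the linear dynamics (8.1) being considered, the $t$ step state transition map takes the
form

$$X_t = A^t X_0 + \sum_{s=0}^{t-1} A^{t-s-1} B W_s = A^t X_0 + H_t W_{0:t-1}. \tag{8.10}$$

Here, the state-noise sequence $Y_{0:t-1}$ and the noise sequence $W_{0:t-1}$, defined by (2.4) and (2.5), are
organized as the vectors

$$Y_{0:t-1} := \begin{bmatrix} X_0 \\ W_{0:t-1} \end{bmatrix}, \qquad W_{0:t-1} := \begin{bmatrix} W_0 \\ \vdots \\ W_{t-1} \end{bmatrix} \tag{8.11}$$

of dimensions $n + mt$ and $mt$, respectively. Accordingly, the matrix $F_t$, which describes the linear
state transition map $Y_{0:t-1} \mapsto X_t$ in (8.10), is associated with $H_t$ from (8.6) by

$$F_t := \begin{bmatrix} A^t & H_t \end{bmatrix} = \begin{bmatrix} A^t & A^{t-1}B & \ldots & AB & B \end{bmatrix}. \tag{8.12}$$

The linearity of the system (8.1) allows the first two moments of $X_t$ to be related to those of the
state-noise sequence $Y_{0:t-1}$ as

$$\mathbf{E} X_t = F_t \mathbf{E} Y_{0:t-1} = A^t \mathbf{E} X_0 + H_t \mathbf{E} W_{0:t-1}, \tag{8.13}$$

$$\begin{aligned}
\operatorname{cov}(X_t) &= F_t \operatorname{cov}(Y_{0:t-1}) F_t^{\mathrm{T}} \\
&= \operatorname{cov}\big(A^t X_0 + H_t \mathbf{E}(W_{0:t-1} \mid X_0)\big) + H_t\, \mathbf{E} \operatorname{cov}(W_{0:t-1} \mid X_0)\, H_t^{\mathrm{T}}.
\end{aligned} \tag{8.14}$$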



Here, $\operatorname{cov}(\xi,\eta) := \mathbf{E}(\xi \eta^{\mathrm{T}}) - \mathbf{E}\xi\, \mathbf{E}\eta^{\mathrm{T}}$ denotes the covariance matrix of square integrable random
vectors $\xi$ and $\eta$, with $\operatorname{cov}(\xi) := \operatorname{cov}(\xi,\xi)$, and $\operatorname{cov}(\xi \mid \zeta) := \mathbf{E}(\xi \xi^{\mathrm{T}} \mid \zeta) - \mathbf{E}(\xi \mid \zeta)\, \mathbf{E}(\xi \mid \zeta)^{\mathrm{T}}$ is the
conditional covariance matrix of $\xi$ given another random vector $\zeta$. Also, use is made of the "total
covariance" identity $\operatorname{cov}(\xi) = \operatorname{cov}(\mathbf{E}(\xi \mid \zeta)) + \mathbf{E} \operatorname{cov}(\xi \mid \zeta)$; cf. [36, Remark 4 on p. 214; Problem 2
on p. 83]. Now, suppose the time horizon $t$ is fixed and satisfies $t \ge n$, being otherwise arbitrary.
We will construct an admissible noise sequence $W_{0:t-1}$, which is jointly Gaussian with $X_0$ and drives
the system (8.1) between the Gaussian state distributions $P_0 = \Phi$ and $P_t = \Psi$ in (8.7) with arbitrary
mean values $\alpha$, $\beta$ and nonsingular covariance matrices $\Sigma$, $\Theta$. Since $t \ge n$, the reachability of $(A,B)$
is equivalent to the positive definiteness of $\Gamma_t$, the reachability Gramian in (8.5), which is equivalent
to the matrix $H_t$ in (8.6) being of full row rank. By substituting the initial and terminal state mean
values $\mathbf{E} X_0 := \alpha$ and $\mathbf{E} X_t := \beta$ into (8.13), it follows that the noise sequence $W_{0:t-1}$ must satisfy

$$\beta = A^t \alpha + H_t \mathbf{E} W_{0:t-1}. \tag{8.15}$$

This equality can be fulfilled, for example, by using the following particular mean values for the
noise sequence:

$$\mathbf{E} W_{0:t-1} = H_t^{\mathrm{T}} \Gamma_t^{-1} (\beta - A^t \alpha). \tag{8.16}$$



The relation (8.15), which does not suppose the distribution of $Y_{0:t-1}$ to be Gaussian, shows that at
the level of first moments, the reachability of $(A,B)$ is not only sufficient but is also necessary for the
reachability of state distributions. Indeed, if $(A,B)$ is not reachable, then the image $\operatorname{im} H_t := \{H_t w :
w \in \mathbb{R}^{mt}\}$ of $\mathcal{W}^t$ under the linear map specified by the matrix $H_t$ is a proper subspace of the system
state space $\mathbb{R}^n$. In this case, (8.15) cannot be satisfied if, for example, $\alpha = 0$ and $\beta \notin \operatorname{im} H_t$, thus
proving the necessity. We will now consider the second moments. By substituting the initial and
terminal state covariance matrices $\operatorname{cov}(X_0) := \Sigma$ and $\operatorname{cov}(X_t) := \Theta$ from (8.7) into (8.14), it follows
that the Gaussian noise sequence $W_{0:t-1}$ being constructed must also satisfy

$$\Theta = (A^t + H_t K)\, \Sigma\, (A^t + H_t K)^{\mathrm{T}} + H_t L H_t^{\mathrm{T}}, \tag{8.17}$$

where the matrices

$$K := \operatorname{cov}(W_{0:t-1}, X_0)\, \Sigma^{-1}, \qquad L := \operatorname{cov}(W_{0:t-1}) - \operatorname{cov}(W_{0:t-1}, X_0)\, \Sigma^{-1} \operatorname{cov}(X_0, W_{0:t-1}), \tag{8.18}$$

together with $\Sigma$, parametrize the covariance matrix of the state-noise sequence $Y_{0:t-1}$ computed in
accordance with (8.11) as

$$\operatorname{cov}(Y_{0:t-1}) = \begin{bmatrix} \Sigma & \Sigma K^{\mathrm{T}} \\ K \Sigma & K \Sigma K^{\mathrm{T}} + L \end{bmatrix}. \tag{8.19}$$



Since $X_0$ and $W_{0:t-1}$ are jointly Gaussian by construction, the matrix $L$ in (8.18) coincides with the
conditional covariance matrix $\operatorname{cov}(W_{0:t-1} \mid X_0)$ which does not depend on the conditioning random
vector $X_0$ in the Gaussian case. For a Gaussian state-noise sequence $Y_{0:t-1}$, the admissibility of the
noise, that is, the conditional absolute continuity of $P_{W_{0:t-1}|X_0}$ in the sense of (4.6), is equivalent to
$L \succ 0$. The covariance condition (8.17) is satisfied, for example, if the matrices (8.18) are chosen as

$$K = H_t^{\mathrm{T}} \Gamma_t^{-1} \big(\sqrt{\Theta - \varepsilon \Gamma_t}\, \Sigma^{-1/2} - A^t\big), \qquad L = \varepsilon I_{mt}. \tag{8.20}$$

Here, $\varepsilon$ is a positive parameter small enough to ensure the positive semi-definiteness of $\Theta - \varepsilon \Gamma_t$ for
the real matrix square root to be well-defined, which is equivalent to (8.9). Thus, a Gaussian noise
sequence $W_{0:t-1}$ with the conditional distribution

$$P_{W_{0:t-1}|X_0} = \mathcal{N}(\mathbf{E}(W_{0:t-1} \mid X_0),\ L), \qquad \mathbf{E}(W_{0:t-1} \mid X_0) = \mathbf{E} W_{0:t-1} + K (X_0 - \alpha), \tag{8.21}$$

whose parameters are given by (8.16) and (8.20), indeed drives the state distribution of the system
from $P_0 = \Phi$ to $P_t = \Psi$ as specified in (8.7). Here, use is made of well-known results on conditional
distributions for jointly Gaussian random vectors [21, 36]. Now, (8.8) is obtained by substitution of
(8.16) and (8.20) into (8.21). □
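The construction in the proof of Theorem 8.1 can be verified numerically through the moment relations (8.13) and (8.17). The sketch below uses illustrative system matrices and boundary distributions (assumptions made here, not data from the paper) and a hypothetical helper `steering_noise`. The parameter `eps_scale` in $(0,1]$ selects $\varepsilon$ as a fraction of the upper bound in (8.9).

```python
import numpy as np

def sym_sqrt(M):
    """Symmetric square root of a symmetric positive semi-definite matrix."""
    lam, U = np.linalg.eigh(M)
    return (U * np.sqrt(np.clip(lam, 0.0, None))) @ U.T

def steering_noise(A, B, t, alpha, Sigma, beta, Theta, eps_scale=0.5):
    """Mean, gain K and covariance L of the Gaussian noise in (8.16), (8.20)."""
    H = np.hstack([np.linalg.matrix_power(A, k) @ B for k in range(t - 1, -1, -1)])
    Gam = H @ H.T                                   # Gamma_t, see (8.5)
    At = np.linalg.matrix_power(A, t)
    rho = np.max(np.abs(np.linalg.eigvals(Gam @ np.linalg.inv(Theta))))
    eps = eps_scale / rho                           # satisfies (8.9)
    HtGinv = H.T @ np.linalg.inv(Gam)
    mean_W = HtGinv @ (beta - At @ alpha)           # (8.16)
    root = sym_sqrt(Theta - eps * Gam) @ np.linalg.inv(sym_sqrt(Sigma))
    K = HtGinv @ (root - At)                        # (8.20)
    L = eps * np.eye(H.shape[1])
    return H, At, mean_W, K, L

# Illustrative data (assumed): reachable pair (A, B) with n = 2 and horizon t = n.
A = np.array([[0.5, 0.2], [0.0, 0.3]])
B = np.array([[0.0], [1.0]])
alpha, Sigma = np.array([1.0, -1.0]), np.array([[2.0, 0.5], [0.5, 1.0]])
beta, Theta = np.array([0.0, 2.0]), np.array([[1.0, 0.2], [0.2, 0.5]])

H, At, mean_W, K, L = steering_noise(A, B, 2, alpha, Sigma, beta, Theta)
mean_T = At @ alpha + H @ mean_W                               # right-hand side of (8.15)
cov_T = (At + H @ K) @ Sigma @ (At + H @ K).T + H @ L @ H.T    # right-hand side of (8.17)
```

With these choices, `mean_T` and `cov_T` reproduce `beta` and `Theta` up to rounding errors, and `L` is positive definite, so the noise is admissible.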
Remark. It follows from Theorem 8.1 that the noise player can drive the linear system (8.1)
with a reachable pair $(A,B)$ between arbitrary nonsingular Gaussian state distributions by using
Gaussian noise sequences, provided the time horizon $t$ is not smaller than the state dimension $n$. The
latter condition can be relaxed to $t \ge \tau$, where

$$\tau := \min\{t > 0 :\ \Gamma_t \succ 0\} \tag{8.22}$$

is the first time when the matrix $H_t$ in (8.6) acquires full row rank. For example, if $n \le m$ and
$\operatorname{rank} B = n$, then $\tau = 1$. ▲
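The time $\tau$ in (8.22) can be found by accumulating the blocks of $H_t$ until full row rank is reached. The following sketch uses a hypothetical helper name and illustrative matrices; it relies on the standard fact (Cayley-Hamilton theorem) that the rank of $H_t$ does not grow beyond $t = n$, so the search can stop there.

```python
import numpy as np

def first_full_rank_time(A, B):
    """Smallest t >= 1 with rank H_t = n, or None if (A, B) is not reachable."""
    n = A.shape[0]
    H = np.zeros((n, 0))
    block = np.asarray(B, dtype=float)     # current block A^{t-1} B
    for t in range(1, n + 1):
        H = np.hstack([block, H])          # prepend A^{t-1} B as in (8.6)
        if np.linalg.matrix_rank(H) == n:
            return t
        block = A @ block
    return None

# Illustrative cases (assumed data).
A_shift = np.array([[0.0, 1.0], [0.0, 0.0]])
tau_single_input = first_full_rank_time(A_shift, np.array([[0.0], [1.0]]))
tau_full_input = first_full_rank_time(A_shift, np.eye(2))
tau_unreachable = first_full_rank_time(0.5 * np.eye(2), np.array([[1.0], [0.0]]))
```

In the first case two steps are needed, in the second case $\operatorname{rank} B = n$ gives $\tau = 1$, and in the third case the pair is not reachable, so that no such time exists.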

Although the specific choice of a noise sequence which was made in the proof of Theorem 8.1
is not unique, it turns out that the class of Gaussian noise strategies is large enough to contain an
optimal strategy for the problem (7.1) with Gaussian boundary conditions $\Phi$ and $\Psi$, so that more
general (non-Gaussian) noise strategies are not superior in this case.

9. Computing the minimum conditional relative entropy supply for linear systems. The
significance of Gaussian noise sequences for minimizing the conditional relative entropy supply in
the case of linear dynamics (8.1) is clarified by the following lemma. This lemma, which is provided
for the sake of completeness, is an adaptation to the present case of well-known results which are
closely related to the maximum entropy principle [8, 22]; see also [32, Lemma 4 on pp. 313-314].

LEMMA 9.1. Suppose $\xi$ is a square integrable $\mathbb R^r$-valued random vector with an absolutely continuous probability distribution $P_\xi$. Then its relative entropy (4.4) with respect to the $r$-dimensional Gaussian distribution $\mathcal N(a,C)$ with mean $a \in \mathbb R^r$ and covariance matrix $C \succ 0$ satisfies

$$D\big(P_\xi \,\|\, \mathcal N(a,C)\big) \geqslant \tfrac12\Big(\|\mathbf E\xi - a\|_{C^{-1}}^2 + \underbrace{\operatorname{Tr}\mathcal C - \ln\det\mathcal C - r}_{\text{"covariance" part}}\Big), \qquad \mathcal C := C^{-1}\operatorname{cov}(\xi), \qquad (9.1)$$

where $\|v\|_M := \sqrt{v^{\mathrm T} M v}$ denotes the Euclidean norm generated by a real positive definite symmetric matrix $M$. Furthermore, the inequality (9.1) becomes an equality if and only if the distribution $P_\xi$ is Gaussian.
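A scalar illustration of Lemma 9.1 may help. The sketch below takes $r = 1$ and compares the right-hand side of (9.1) with the exact relative entropy of a uniform distribution on $[-h,h]$ with respect to $\mathcal N(a,c)$; the uniform law and the numerical values are illustrative assumptions, chosen because the relative entropy is available in closed form.

```python
import math

def gaussian_bound(mean, var, a, c):
    """Right-hand side of (9.1) for r = 1: reference N(a, c), moments (mean, var)."""
    ratio = var / c
    return 0.5 * ((mean - a) ** 2 / c + ratio - math.log(ratio) - 1.0)

def uniform_relative_entropy(h, a, c):
    """Exact D(Uniform[-h, h] || N(a, c)), computed as E ln(p/q) in closed form."""
    second_moment_about_a = h * h / 3.0 + a * a   # E (xi - a)^2, since E xi = 0
    return (-math.log(2.0 * h) + 0.5 * math.log(2.0 * math.pi * c)
            + second_moment_about_a / (2.0 * c))

h, a, c = 2.0, 0.5, 1.5
exact = uniform_relative_entropy(h, a, c)
bound = gaussian_bound(mean=0.0, var=h * h / 3.0, a=a, c=c)
gap = exact - bound
print(exact, bound, gap)
```

For the uniform law the gap does not depend on $h$, $a$ or $c$ and equals $\tfrac12\ln(2\pi e/12) > 0$, so the bound is strict, as expected for a non-Gaussian distribution.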

The nonnegativeness of the covariance part of the right-hand side of (9.1) follows directly from the positive definiteness of $C$ and $\operatorname{cov}(\xi)$, whereby the eigenvalues $\lambda_1, \ldots, \lambda_r$ of the matrix $\mathcal C = C^{-1}\operatorname{cov}(\xi)$ are all real and positive [16, Theorem 7.6.3 on p. 465]:

$$\operatorname{Tr}\mathcal C - \ln\det\mathcal C - r = \sum_{k=1}^{r} (\lambda_k - \ln\lambda_k - 1) \geqslant 0. \qquad (9.2)$$

This quantity vanishes if and only if $C = \operatorname{cov}(\xi)$, since $\min_{\lambda>0}(\lambda - \ln\lambda) = 1$ is achieved only at $\lambda = 1$. The following theorem provides a solution to the optimization problem (7.1) with Gaussian boundary conditions.

THEOREM 9.2. Suppose the linear system (8.1) has a reachable pair $(A,B)$, and the matrix $A$ is asymptotically stable. Then for any time horizon $t \geqslant n$ and any initial and terminal Gaussian state distributions $\Phi$ and $\Psi$ in (8.7) with nonsingular covariance matrices, the minimum required conditional relative entropy supply (7.1) is computed as

$$J_t(\Phi,\Psi) = \Big(\|\beta - A^t\alpha\|_{\Gamma_t^{-1}}^2 + \operatorname{Tr}\big(U + V - \sqrt{I_n + 4UV}\big) - \ln\det\mho\Big)/2. \qquad (9.3)$$

Here,

$$U := \Gamma_t^{-1/2} A^t \Sigma (A^t)^{\mathrm T} \Gamma_t^{-1/2}, \qquad V := \Gamma_t^{-1/2}\,\Theta\,\Gamma_t^{-1/2} \qquad (9.4)$$

are real positive semi-definite symmetric matrices (with $V \succ 0$) defined using (8.5), and $\mho$ is a real positive definite symmetric matrix of order $n$ satisfying the algebraic Riccati equation

$$\mho + \mho U \mho = V. \qquad (9.5)$$



Proof. Suppose the system under consideration is initialized at the state distribution $P_0 = \Phi$. Then, in view of (5.2), the conditional relative entropy supply (4.5) over the time interval $[0,t)$ takes the form

$$\mathcal E_{0,t} = D\big(Q_{0,t} \,\|\, P_* \times R^t\big) - D\big(\Phi \,\|\, P_*\big), \qquad (9.6)$$

where, as before, $Q_{0,t}$ is the probability distribution of the state-noise sequence $Y_{0:t-1}$. In order to ensure the terminal condition $P_t = \Psi$, the moments $\mathbf E Y_{0:t-1}$ and $\operatorname{cov}(Y_{0:t-1})$ must satisfy (8.13) and (8.14). In view of (8.2) and (8.3), the probability measure $P_* \times R^t = \mathcal N\big(0, \operatorname{diag}(\Gamma, I_{mt})\big)$ is a Gaussian distribution in $\mathbb R^{n+mt}$ whose covariance matrix is nonsingular by the reachability of $(A,B)$. Hence, Lemma 9.1 implies that the minimum of $\mathcal E_{0,t}$ in (9.6) with respect to $Q_{0,t}$ with fixed $\mathbf E Y_{0:t-1}$ and $\operatorname{cov}(Y_{0:t-1})$ is achieved at the Gaussian distribution $\mathcal N\big(\mathbf E Y_{0:t-1}, \operatorname{cov}(Y_{0:t-1})\big)$. Also, by Theorem 8.1, for Gaussian initial and terminal state distributions (8.7), there exist Gaussian noise sequences which drive the system from $P_0 = \Phi$ to $P_t = \Psi$. Therefore, consideration can be restricted to Gaussian state-noise sequences, so that Lemma 9.1 reduces the computation of $J_t(\Phi,\Psi)$ to the constrained minimization of the conditional relative entropy

$$\mathcal E_{0,t} = D\big(P_{W_{0:t-1}\mid X_0} \,\|\, \mathcal N(0, I_{mt})\big)$$
$$= \mathbf E\big(|\mathbf E(W_{0:t-1}\mid X_0)|^2 + \operatorname{Tr} L - \ln\det L - mt\big)/2$$
$$= \big(|\mathbf E W_{0:t-1}|^2 + \operatorname{Tr}(K\Sigma K^{\mathrm T} + L) - \ln\det L - mt\big)/2. \qquad (9.7)$$

Here, use is made of the property that if the state-noise sequence $Y_{0:t-1}$ is Gaussian with covariance matrix (8.19), then the conditional distribution $P_{W_{0:t-1}\mid X_0}$ is given by (8.21), and hence,

$$\mathbf E\big(|\mathbf E(W_{0:t-1}\mid X_0)|^2\big) = |\mathbf E W_{0:t-1}|^2 + \operatorname{Tr}(K\Sigma K^{\mathrm T}).$$

The right-hand side of (9.7) is to be minimized over the mean $\mathbf E W_{0:t-1}$ subject to (8.15) and over the matrices $K$ and $L$ from (8.18) and (8.19) subject to the covariance condition (8.17). The constrained minimization of (9.7) over $\mathbf E W_{0:t-1}$ subject to (8.15) can be "decoupled" from the minimization with respect to $K$ and $L$. By applying the linearly constrained least squares method and recalling (8.12) and (8.5), it follows that

$$\min_{\mathbf E W_{0:t-1}\ \text{satisfying (8.15)}} |\mathbf E W_{0:t-1}|^2 = \|\beta - A^t\alpha\|_{\Gamma_t^{-1}}^2. \qquad (9.8)$$

Here, the minimum is achieved at $\mathbf E W_{0:t-1}$, described by (8.16), which can be represented in a step-wise form as $\mathbf E W_k = B^{\mathrm T}(A^{t-1-k})^{\mathrm T}\Gamma_t^{-1}(\beta - A^t\alpha)$ for $k = 0, \ldots, t-1$. This can be obtained by solving a linear-quadratic optimal control problem [18, 19] of minimizing the function $\sum_{k=0}^{t-1}|u_k|^2$ for the dynamical system $\mathbf E X_{k+1} = A\,\mathbf E X_k + B u_k$ with respect to $u_k := \mathbf E W_k$ subject to the boundary conditions $\mathbf E X_0 = \alpha$ and $\mathbf E X_t = \beta$. The latter system results from averaging the linear dynamics (8.1). We will now minimize the remaining part

$$\operatorname{Tr}(K\Sigma K^{\mathrm T} + L) - \ln\det L \qquad (9.9)$$

of (9.7) with respect to the matrices $K$ and $L$ subject to the covariance constraint (8.17). Since $\Sigma \succ 0$, and $\ln\det L$ is strictly concave in $L \succ 0$ [16, Theorem 7.6.7 on p. 466], the function in (9.9) is strictly convex in $K$ and $L$. The structure of the constraint (8.17) allows corresponding Lagrange multipliers to be assembled in a real symmetric $(n \times n)$-matrix $N$, so that the Lagrange function for minimizing (9.9) subject to the constraint (8.17) is

$$\Lambda(K,L) := \operatorname{Tr}(K\Sigma K^{\mathrm T} + L) - \ln\det L - \operatorname{Tr}\Big(N\big((A^t + H_tK)\Sigma(A^t + H_tK)^{\mathrm T} + H_tLH_t^{\mathrm T}\big)\Big). \qquad (9.10)$$

Here, the last trace is the Frobenius inner product [16] of the matrix $N$ and a real symmetric matrix on the right-hand side of (8.17). The equations for the Frechet derivatives of $\Lambda$ (with respect to the matrices $K$ and $L$) to vanish are

$$\partial_K\Lambda(K,L) = 2\big((I_{mt} - H_t^{\mathrm T} N H_t)K - H_t^{\mathrm T} N A^t\big)\Sigma = 0, \qquad (9.11)$$

$$\partial_L\Lambda(K,L) = I_{mt} - H_t^{\mathrm T} N H_t - L^{-1} = 0, \qquad (9.12)$$

where use is made of the Frechet derivative $\partial_L \ln\det L = L^{-1}$. By solving (9.12) for $L$ and substituting the result into (9.11), it follows that the stationary point of the Lagrange function (9.10) is described by

$$K = L H_t^{\mathrm T} N A^t, \qquad L = (I_{mt} - H_t^{\mathrm T} N H_t)^{-1}. \qquad (9.13)$$

Since the reachability Gramian in (8.5) satisfies $\Gamma_t \succ 0$ for $t \geqslant n$, the matrix inversion lemma [16, pp. 18-19] yields

$$S := H_t L H_t^{\mathrm T} = H_t\big(I_{mt} + H_t^{\mathrm T}(I_n - N H_t H_t^{\mathrm T})^{-1} N H_t\big)H_t^{\mathrm T} = \Gamma_t + \Gamma_t (I_n - N\Gamma_t)^{-1} N \Gamma_t = (\Gamma_t^{-1} - N)^{-1}, \qquad (9.14)$$

$$H_t K = S N A^t = (S\Gamma_t^{-1} - I_n)A^t. \qquad (9.15)$$

Hence, the covariance relation (8.17) takes the form of an algebraic Riccati equation in the matrix $S$:

$$\Theta = (I_n + SN)A^t\Sigma(A^t)^{\mathrm T}(I_n + NS) + S = S\Gamma_t^{-1}A^t\Sigma(A^t)^{\mathrm T}\Gamma_t^{-1}S + S. \qquad (9.16)$$

Since $H_t$ is of full row rank and $L \succ 0$, then (9.14) implies that $S \succ 0$. In view of (9.4), left and right multiplication of both sides of (9.16) by $\Gamma_t^{-1/2}$ leads to an equivalent Riccati equation (9.5) in the real positive definite symmetric matrix

$$\mho := \Gamma_t^{-1/2} S\, \Gamma_t^{-1/2}. \qquad (9.17)$$

Since $U \succcurlyeq 0$ and $V \succ 0$, the Riccati equation (9.5) has a unique solution $\mho \succ 0$; see, for example, [20]. We will now express the minimum value of the function (9.9) in terms of $\mho$. Recall that for conforming matrices $C$ and $D$, the matrices $CD$ and $DC$ share nonzero eigenvalues [16, Theorem 1.3.20 on p. 53]. Hence, by changing the order in which $H_t^{\mathrm T}$ and $NH_t$ are multiplied in the representation of the matrix $L$ in (9.13) and using (8.5) and (9.14), it follows that the spectrum of $L$ differs from that of

$$(I_n - NH_tH_t^{\mathrm T})^{-1} = (I_n - N\Gamma_t)^{-1} = \Gamma_t^{-1}S$$

only by ones. Since spectra are invariant under similarity transformations [16], the eigenvalues of $\Gamma_t^{-1}S = \Gamma_t^{-1/2}\,\mho\,\sqrt{\Gamma_t}$ are identical to those of $\mho$ in (9.17). Therefore,

$$\det L = \det\mho, \qquad \operatorname{Tr} L = \operatorname{Tr}\mho + mt - n. \qquad (9.18)$$

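Furthermore, (9.11) and (9.15) imply that $K = H_t^{\mathrm T}N(A^t + H_tK) = H_t^{\mathrm T}NS\Gamma_t^{-1}A^t$, and hence,

$$\operatorname{Tr}(K\Sigma K^{\mathrm T}) = \operatorname{Tr}\big(H_t^{\mathrm T}NS\Gamma_t^{-1}A^t\Sigma(A^t)^{\mathrm T}\Gamma_t^{-1}SNH_t\big)$$
$$= \operatorname{Tr}\big(\Gamma_tNS\Gamma_t^{-1/2}U\Gamma_t^{-1/2}SN\big)$$
$$= \operatorname{Tr}\big(\sqrt{\Gamma_t}N\sqrt{\Gamma_t}\,\mho U\mho\,\sqrt{\Gamma_t}N\sqrt{\Gamma_t}\big) = \operatorname{Tr}(\Delta U\Delta). \qquad (9.19)$$

Here,

$$\Delta := \sqrt{\Gamma_t}\,N\sqrt{\Gamma_t}\,\mho = \sqrt{\Gamma_t}\,(\Gamma_t^{-1} - S^{-1})\sqrt{\Gamma_t}\,\mho = \mho - I_n \qquad (9.20)$$

is a real symmetric matrix associated with (9.17), and use has been made of (9.14) which implies that $N = \Gamma_t^{-1} - S^{-1}$. Now, by combining (9.20) with the Riccati equation (9.5), it follows that

$$\Delta U\Delta = U - U\mho - \mho U + \mho U\mho = U - U\mho - \mho U + V - \mho,$$

and hence, (9.19) becomes

$$\operatorname{Tr}(K\Sigma K^{\mathrm T}) = \operatorname{Tr}(U + V - 2U\mho - \mho). \qquad (9.21)$$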

Furthermore, Lemma 10.1, which will be established in Section 10 independently of the current proof, implies that

$$2U\mho = 4UV\big(I_n + \sqrt{I_n + 4UV}\big)^{-1} = \sqrt{I_n + 4UV} - I_n. \qquad (9.22)$$

It now follows from (9.21), (9.22) and (9.18) that the minimum value of the function (9.9) is computed as

$$\min_{K,L\ \text{satisfying (8.17)}} \big(\operatorname{Tr}(K\Sigma K^{\mathrm T} + L) - \ln\det L\big)$$
$$= \operatorname{Tr}(U + V - 2U\mho) - \ln\det\mho + mt - n$$
$$= \operatorname{Tr}\big(U + V - \sqrt{I_n + 4UV}\big) - \ln\det\mho + mt. \qquad (9.23)$$

Finally, (9.3) is obtained by substituting (9.8) and (9.23) into the right-hand side of (9.7). □
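The chain (9.13)-(9.23) can be cross-checked numerically. The following is a rough sketch under stated assumptions: $H_t = [A^{t-1}B, \ldots, B]$, the constraint (8.17) in the form used in (9.10), and the closed-form solution (10.1) of the Riccati equation evaluated through the similarity $UV = V^{-1/2}(V^{1/2}UV^{1/2})V^{1/2}$. All numerical data are illustrative and not from the paper.

```python
import numpy as np

def sym_sqrt(M):
    """Square root of a symmetric positive definite matrix."""
    w, Q = np.linalg.eigh(M)
    return (Q * np.sqrt(w)) @ Q.T

# Illustrative data (assumed): n = 2, m = 1, t = 3, so that mt = 3 > n.
A = np.array([[0.5, 1.0], [0.0, 0.3]])
B = np.array([[0.0], [1.0]])
Sigma = np.array([[1.0, 0.2], [0.2, 0.5]])
Theta = np.array([[2.0, -0.3], [-0.3, 1.5]])
t, n = 3, 2

H = np.hstack([np.linalg.matrix_power(A, t - 1 - k) @ B for k in range(t)])
mt = H.shape[1]
At = np.linalg.matrix_power(A, t)
Gam = H @ H.T                                   # Gamma_t
Gh = sym_sqrt(Gam)
Gih = np.linalg.inv(Gh)
U = Gih @ At @ Sigma @ At.T @ Gih               # (9.4)
V = Gih @ Theta @ Gih

# Closed-form Riccati solution (10.1) via the symmetric matrix W = V^{1/2} U V^{1/2}.
Vh = sym_sqrt(V)
W = Vh @ U @ Vh
W = (W + W.T) / 2.0
mho = 2.0 * Vh @ np.linalg.inv(np.eye(n) + sym_sqrt(np.eye(n) + 4.0 * W)) @ Vh

S = Gh @ mho @ Gh                               # inverts (9.17)
N = np.linalg.inv(Gam) - np.linalg.inv(S)       # from (9.14)
L = np.linalg.inv(np.eye(mt) - H.T @ N @ H)     # (9.13)
K = L @ H.T @ N @ At

M = At + H @ K
Theta_check = M @ Sigma @ M.T + H @ L @ H.T     # constraint (8.17)
cost = np.trace(K @ Sigma @ K.T + L) - np.linalg.slogdet(L)[1]        # (9.9)
trace_root = np.sqrt(1.0 + 4.0 * np.linalg.eigvalsh(W)).sum()         # Tr sqrt(I + 4UV)
cost_923 = np.trace(U + V) - trace_root - np.linalg.slogdet(mho)[1] + mt
print(np.allclose(Theta_check, Theta), np.isclose(cost, cost_923))
```

The multiplier $N$ is recovered here from $S$ rather than solved for directly, which is only a convenient way of reproducing the stationary point for a numerical check.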
A closed-form solution of the Riccati equation (9.5) will be provided in Section 10. The proof of Theorem 9.2 shows that $\mho = \operatorname{cov}(\Gamma_t^{-1/2}X_t \mid X_0)$ is the conditional covariance matrix of the "balanced" terminal state $\Gamma_t^{-1/2}X_t$ of the system under the optimal noise strategy on the time interval $[0,t)$ which delivers the minimum value $J_t(\Phi,\Psi)$ in the problem (7.1). The corresponding cross-covariance matrix of the initial and balanced terminal states is $\operatorname{cov}(\Gamma_t^{-1/2}X_t, X_0) = \mho\,\Gamma_t^{-1/2}A^t\Sigma$. Similarly to the inequality (9.2), the "covariance" part of the right-hand side of (9.3) is always nonnegative: $\operatorname{Tr}\big(U + V - \sqrt{I_n + 4UV}\big) - \ln\det\mho = \operatorname{Tr}(\Delta + \Delta U\Delta) - \ln\det(I_n + \Delta) \geqslant \operatorname{Tr}\Delta - \ln\det(I_n + \Delta) \geqslant 0$ in view of (9.20). It only vanishes if the solution of the Riccati equation (9.5) is $\mho = I_n$, or equivalently, if the matrices (9.4) satisfy $V = I_n + U$. The latter equality holds if and only if the initial and terminal state covariance matrices $\Sigma$ and $\Theta$ in (8.7) are related by the Lyapunov equation

$$\Theta = A^t\Sigma(A^t)^{\mathrm T} + \Gamma_t.$$

The right-hand side of this equation, as a function of time $t$, describes the evolution of the state covariance matrix $\operatorname{cov}(X_t)$ which the linear system (8.1) would have under the nominal noise, provided $\operatorname{cov}(X_0) = \Sigma$. Furthermore, as $t \to +\infty$, the minimum conditional relative entropy supply required to drive the system to the terminal state distribution $\Psi = \mathcal N(\beta,\Theta)$ ceases to depend on the initial state distribution $\Phi$ from (8.7) and approaches the relative entropy of $\Psi$ with respect to the nominal invariant state distribution $P_*$ in (8.3),

$$\lim_{t\to+\infty} J_t(\Phi,\Psi) = \big(\|\beta\|_{\Gamma^{-1}}^2 + \operatorname{Tr}(\Gamma^{-1}\Theta) - \ln\det(\Gamma^{-1}\Theta) - n\big)/2 = D(\Psi \,\|\, P_*), \qquad (9.24)$$

where $\Gamma = \lim_{t\to+\infty}\Gamma_t$ is given by (8.4). This can be obtained from (9.3), since $\rho(A) < 1$ implies that the matrix $U$ in (9.4) vanishes asymptotically, while both $V$ and the solution $\mho$ of the Riccati equation converge to $\Gamma^{-1/2}\Theta\,\Gamma^{-1/2}$. Since the infinite-horizon limit of $J_t(\Phi,\Psi)$ in (9.24) is independent of $\Phi$, it could not be less than $D(\Psi \,\|\, P_*)$, in view of the lower bound (7.3).
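The infinite-horizon limit (9.24) is easy to observe in the scalar case. The sketch below specializes (9.3)-(9.5) to $n = m = 1$, where $\mho = 2V/(1 + \sqrt{1 + 4UV})$, and assumes the finite-horizon Gramian $\Gamma_t = B^2\sum_{k<t}A^{2k}$; the function names and numerical values are hypothetical.

```python
import math

def min_supply_scalar(A, B, t, alpha, Sigma, beta, Theta):
    """Formula (9.3) specialized to scalar A, B with |A| < 1 and B != 0."""
    Gam_t = B * B * sum(A ** (2 * k) for k in range(t))   # finite-horizon Gramian
    U = A ** (2 * t) * Sigma / Gam_t
    V = Theta / Gam_t
    root = math.sqrt(1.0 + 4.0 * U * V)
    mho = 2.0 * V / (1.0 + root)                          # scalar solution of (9.5)
    mean_part = (beta - A ** t * alpha) ** 2 / Gam_t
    return 0.5 * (mean_part + U + V - root - math.log(mho))

def gaussian_relative_entropy(beta, Theta, Gam):
    """D(N(beta, Theta) || N(0, Gam)) in the scalar case, as in (9.24)."""
    ratio = Theta / Gam
    return 0.5 * (beta * beta / Gam + ratio - math.log(ratio) - 1.0)

A, B = 0.8, 1.0
Gam = B * B / (1.0 - A * A)                               # infinite-horizon Gramian
limit = gaussian_relative_entropy(1.0, 2.0, Gam)
for t in (1, 5, 20, 200):
    j1 = min_supply_scalar(A, B, t, 0.0, 1.0, 1.0, 2.0)   # Phi = N(0, 1)
    j2 = min_supply_scalar(A, B, t, 3.0, 5.0, 1.0, 2.0)   # Phi = N(3, 5)
    print(t, j1, j2, limit)
```

For short horizons the two initial distributions give visibly different values, while for large $t$ both approach the same limit $D(\Psi\,\|\,P_*)$.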

Remark. In view of Lemma 9.1, the proof of Theorem 9.2 shows that the right-hand side of the equality (9.3), which is computed in terms of the first two moments $\alpha$, $\Sigma$ and $\beta$, $\Theta$ of the initial and terminal state distributions $\Phi$ and $\Psi$, remains valid as a lower bound for $J_t(\Phi,\Psi)$ if $\Phi$ or $\Psi$ are not Gaussian. ▲

10. Closed-form solution of the Riccati equation. The following lemma provides an explicit solution to the Riccati equation (9.5), which will allow the result of Theorem 9.2 to be given in a closed form.

LEMMA 10.1. The algebraic Riccati equation (9.5), with $U \succcurlyeq 0$ and $V \succ 0$, has a unique positive definite solution which is computed as

$$\mho = 2V\big(I_n + \sqrt{I_n + 4UV}\big)^{-1}. \qquad (10.1)$$

Proof. Since $\mho \succ 0$, then by left multiplying both sides of (9.5) by $\mho^{-1}$ and right multiplying them by a matrix

$$T := \mho^{-1}V, \qquad (10.2)$$

the Riccati equation is transformed to $\mho^{-1}\mho(I_n + U\mho)T = \mho^{-1}VT$, which is a quadratic equation in the matrix $T$:

$$T^2 - T = UV. \qquad (10.3)$$

The latter can, in principle, be solved by completing the square as $T^2 - T = (T - I_n/2)^2 - I_n/4$, so that (10.3) yields

$$T = I_n/2 + \sqrt{I_n/4 + UV} = \big(I_n + \sqrt{I_n + 4UV}\big)/2. \qquad (10.4)$$

However, a more rigorous way to arrive at (10.4), which gives the correct meaning to the square root, is as follows. The properties $\mho \succ 0$ and $V \succ 0$ imply that the matrix $T$ in (10.2) is diagonalizable and its eigenvalues $d_1, \ldots, d_n$ are all real and positive in view of [16, Theorem 7.6.3 on p. 465]. Moreover,

$$d_k \geqslant 1, \qquad k = 1, \ldots, n. \qquad (10.5)$$

Indeed, from (9.5) and the condition $U \succcurlyeq 0$, it follows that $V \succcurlyeq \mho$, and hence, $C := \mho^{-1/2}V\mho^{-1/2} \succcurlyeq I_n$, whereby the eigenvalues of the matrix $C$ are not less than 1. It remains to note that the matrix $T$ in (10.2) is related to $C$ by a similarity transformation $T = \mho^{-1/2}\,\mho^{-1/2}V\mho^{-1/2}\,\sqrt{\mho} = \mho^{-1/2}C\sqrt{\mho}$, whereby $T$ has the same spectrum as $C$, thus proving (10.5). Due to its diagonalizability, the matrix $T$ can be represented as

$$T = EDE^{-1}, \qquad D := \operatorname{diag}_{1\leqslant k\leqslant n}(d_k), \qquad (10.6)$$

where the columns of $E$ are the corresponding eigenvectors of $T$. Substitution of (10.6) into (10.3) yields

$$E\Omega E^{-1} = UV, \qquad \Omega := D^2 - D = \operatorname{diag}_{1\leqslant k\leqslant n}(\omega_k), \qquad d_k^2 - d_k = \omega_k. \qquad (10.7)$$

Hence, the columns of $E$ are also the eigenvectors of $UV$, which correspond to the eigenvalues $\omega_1, \ldots, \omega_n$. Since $UV$ is a diagonalizable matrix whose spectrum is all real and nonnegative (in view of $U \succcurlyeq 0$ and $V \succ 0$), then each of the $n$ quadratic equations in (10.7) has a unique admissible solution $d_k = \big(1 + \sqrt{1 + 4\omega_k}\big)/2$ which satisfies (10.5). Substitution of these solutions into (10.6) yields

$$T = \tfrac12 E\Big(I_n + \operatorname{diag}_{1\leqslant k\leqslant n}\big(\sqrt{1 + 4\omega_k}\big)\Big)E^{-1} = \big(I_n + \sqrt{I_n + 4UV}\big)/2, \qquad (10.8)$$

thus proving (10.4). The second equality from (10.8) was used in the proof of Theorem 9.2 in the form of (9.22). Now, (10.2) allows $\mho$ to be uniquely recovered from $T$ as $\mho = VT^{-1}$, so that (10.1) follows from (10.8). □
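The construction in the proof translates directly into a few lines of code. The sketch below follows (10.6)-(10.8): it diagonalizes $UV$, maps each eigenvalue $\omega_k$ to $d_k = (1 + \sqrt{1 + 4\omega_k})/2$, and recovers $\mho = VT^{-1}$. The function name `riccati_closed_form` and the test matrices are illustrative; taking real parts of the output of `numpy.linalg.eig` assumes a well-conditioned eigenvector matrix.

```python
import numpy as np

def riccati_closed_form(U, V):
    """Solve mho + mho U mho = V for U >= 0, V > 0 via the eigenstructure of U V."""
    omega, E = np.linalg.eig(U @ V)            # U V = E diag(omega) E^{-1}, as in (10.7)
    omega = np.clip(omega.real, 0.0, None)     # spectrum is real and nonnegative in theory
    E = E.real
    d = 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * omega))   # admissible roots of d^2 - d = omega
    T = (E * d) @ np.linalg.inv(E)             # T = E D E^{-1}, see (10.6) and (10.8)
    mho = V @ np.linalg.inv(T)                 # mho = V T^{-1}, from (10.2)
    return mho, d

# Illustrative symmetric matrices (assumed): U >= 0 and V > 0.
U = np.array([[1.0, 0.5], [0.5, 2.0]])
V = np.array([[2.0, -1.0], [-1.0, 3.0]])
mho, d = riccati_closed_form(U, V)

residual = mho + mho @ U @ mho - V
print(np.abs(residual).max(), d)
```

The residual of (9.5) should be at the level of rounding errors, and the computed $d_k$ should satisfy (10.5).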
Substitution of (10.1) into (9.3) leads to an explicit form for the minimum required conditional relative entropy supply, computed in Theorem 9.2:

$$J_t\big(\mathcal N(\alpha,\Sigma), \mathcal N(\beta,\Theta)\big) = \Big(\|\beta - A^t\alpha\|_{\Gamma_t^{-1}}^2 + \operatorname{Tr}\big(U + V - \sqrt{I_n + 4UV}\big) + \ln\det\big(I_n + \sqrt{I_n + 4UV}\big) - \ln\det(2V)\Big)/2, \qquad (10.9)$$

where, as before, the matrices $U$ and $V$ are given by (9.4). In the next section, we will apply the representation (10.9) to computing the robustness index $Z$ in (7.13) for the loss functional $\Xi$ associated with the second moments of the state variables.
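As a hedged illustration of (10.9), the sketch below evaluates the minimum supply from the spectrum of $UV$, which coincides with that of the symmetric matrix $V^{1/2}UV^{1/2}$. It assumes the Gramian $\Gamma_t = \sum_{k<t}A^kBB^{\mathrm T}(A^k)^{\mathrm T}$ for (8.5); the function names and data are hypothetical.

```python
import numpy as np

def sym_sqrt(M):
    """Square root of a symmetric positive definite matrix."""
    w, Q = np.linalg.eigh(M)
    return (Q * np.sqrt(w)) @ Q.T

def gramian(A, B, t):
    """Finite-horizon reachability Gramian Gamma_t (assumed form of (8.5))."""
    n = A.shape[0]
    G, AkB = np.zeros((n, n)), np.array(B, dtype=float)
    for _ in range(t):
        G += AkB @ AkB.T
        AkB = A @ AkB
    return G

def min_supply(A, B, t, alpha, Sigma, beta, Theta):
    """Minimum conditional relative entropy supply via (10.9)."""
    At = np.linalg.matrix_power(A, t)
    Gam = gramian(A, B, t)
    Gih = np.linalg.inv(sym_sqrt(Gam))
    U = Gih @ At @ Sigma @ At.T @ Gih
    V = Gih @ Theta @ Gih
    Vh = sym_sqrt(V)
    W = Vh @ U @ Vh
    omega = np.clip(np.linalg.eigvalsh((W + W.T) / 2.0), 0.0, None)  # spectrum of U V
    root = np.sqrt(1.0 + 4.0 * omega)
    e = beta - At @ alpha
    mean_part = e @ np.linalg.solve(Gam, e)
    cov_part = (np.trace(U) + np.trace(V) - root.sum()
                + np.log(1.0 + root).sum() - np.linalg.slogdet(2.0 * V)[1])
    return 0.5 * (mean_part + cov_part)

A = np.array([[0.5, 1.0], [0.0, 0.3]])
B = np.array([[0.0], [1.0]])
alpha = np.array([1.0, -1.0])
Sigma = np.array([[1.0, 0.2], [0.2, 0.5]])
t = 2
At = np.linalg.matrix_power(A, t)

# Terminal moments produced by the nominal noise: the supply should vanish.
J_nominal = min_supply(A, B, t, alpha, Sigma, At @ alpha, At @ Sigma @ At.T + gramian(A, B, t))
J_other = min_supply(A, B, t, alpha, Sigma, np.zeros(2), np.array([[2.0, -0.3], [-0.3, 1.5]]))
print(J_nominal, J_other)
```

The first value is zero up to rounding because the nominal evolution satisfies the Lyapunov relation discussed after Theorem 9.2, while any other terminal distribution requires a positive supply.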

11. Computing the robustness index for one-step reachable linear systems. Suppose the state dimension of the system (8.1) does not exceed the input dimension, that is, $n \leqslant m$, and the matrix $B$ is of full row rank. Then the one-step reachability Gramian

$$\Gamma_1 = BB^{\mathrm T} \qquad (11.1)$$

from (8.5) is positive definite, so that $\tau = 1$ in (8.22). By Theorem 9.2, the minimum conditional relative entropy supply rate $J_1(\Phi,\Phi)$, required for the noise player to maintain such a system in a state distribution $\Phi$ with mean $\alpha \in \mathbb R^n$ and covariance matrix $\Sigma \succ 0$, satisfies

$$J_1(\Phi,\Phi) \geqslant J_1\big(\mathcal N(\alpha,\Sigma), \mathcal N(\alpha,\Sigma)\big) =: J(\alpha,\Sigma). \qquad (11.2)$$

This inequality follows from the remark made at the end of Section 9 and becomes an equality if $\Phi$ is a Gaussian distribution. The right-hand side of (11.2) is computed by letting $t := 1$, $\beta := \alpha$, $\Theta := \Sigma$ in (9.4) and (10.9) as

$$J(\alpha,\Sigma) = \Big(\|(I_n - A)\alpha\|_{\Gamma_1^{-1}}^2 + \ln\det(\Gamma_1/2) + \operatorname{Tr}\big((A^{\mathrm T}\Gamma_1^{-1}A + \Gamma_1^{-1})\Sigma\big) - \ln\det\Sigma + \ln\det\big(I_n + \sqrt{I_n + 4M}\big) - \operatorname{Tr}\sqrt{I_n + 4M}\Big)/2, \qquad (11.3)$$

where $M$ is an $(n \times n)$-matrix which depends quadratically on $\Sigma$ through the matrices $U$ and $V$ from (9.4) as

$$M := UV = \Gamma_1^{-1/2}A\Sigma A^{\mathrm T}\Gamma_1^{-1}\Sigma\,\Gamma_1^{-1/2}, \qquad U = \Gamma_1^{-1/2}A\Sigma A^{\mathrm T}\Gamma_1^{-1/2}, \qquad V = \Gamma_1^{-1/2}\Sigma\,\Gamma_1^{-1/2}. \qquad (11.4)$$

Now, consider a particular variant of the robustness index $Z$ in (7.13) associated with the following loss functional

$$\Xi(P_*,\Phi) := \frac{\|\alpha\|_\Pi^2 + \operatorname{Tr}(\Pi\Sigma)}{\operatorname{Tr}(\Pi\Gamma)}, \qquad (11.5)$$

where $\alpha$ and $\Sigma$ are the mean vector and covariance matrix of the state distribution $\Phi$, which is not necessarily Gaussian. Here, $\Pi$ is a given real positive definite symmetric matrix of order $n$, and $\Gamma$ is the infinite-horizon reachability Gramian from (8.4). The numerator and denominator of the fraction in (11.5) are the expectations $\mathbf E(\|X_k\|_\Pi^2)$ of the state vector $X_k$ of the system over $\Phi$ and the nominal invariant state distribution $P_*$ from (8.3), respectively, with $\Pi$ playing the role of a weighting matrix. It is assumed that small values of the weighted second moment of the state variables are beneficial for system performance under the nominal noise, so that an increase in this moment, described by (11.5), quantifies the deterioration of the system performance when the statistical uncertainty leads to a different steady-state distribution $\Phi \ne P_*$. Also, $Z(\gamma) = 0$ for all $\gamma \leqslant 1$, and the robustness index $Z(\gamma)$ is positive for $\gamma > 1$. $Z(\gamma)$ will be of interest for those (sufficiently large) values of $\gamma$ which represent a "critical" level of system performance loss in terms of (11.5). Similar ideas, which are concerned with second moment increases in the framework of entropy theoretic formulations of uncertainty, can be found in [7, 10, 23, 32, 34, 42, 39, 40]. The following theorem outlines the computation of the robustness index being considered here. Its formulation employs a function

$$\sigma(z) := \big(\ln(1 + \sqrt{1 + 4z}) - \sqrt{1 + 4z}\big)' = -2/\big(1 + \sqrt{1 + 4z}\big) \qquad (11.6)$$

of a complex variable $z$. Since $\sigma$ is analytic in a neighbourhood of $\mathbb R_+$, then $\sigma(M)$ is well-defined for the matrix $M$ in (11.4) whose eigenvalues are real and nonnegative. In fact, the function $\sigma$ was already used in this role in (10.1).
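The derivative identity in (11.6) and its link to (10.1) can be checked on the real axis. The sketch below compares a central finite difference of $\ln(1 + \sqrt{1 + 4z}) - \sqrt{1 + 4z}$ with the closed form of $\sigma$, and verifies the scalar version of (10.1) in the form $\mho = -V\sigma(UV)$; the helper name `phi` and the sample points are illustrative assumptions.

```python
import math

def phi(z):
    """The function whose derivative defines sigma in (11.6)."""
    s = math.sqrt(1.0 + 4.0 * z)
    return math.log(1.0 + s) - s

def sigma(z):
    """Closed form of the derivative from (11.6)."""
    return -2.0 / (1.0 + math.sqrt(1.0 + 4.0 * z))

step = 1e-6
errors = []
for z in (0.0, 0.5, 3.0):
    finite_difference = (phi(z + step) - phi(z - step)) / (2.0 * step)
    errors.append(abs(finite_difference - sigma(z)))
print(errors)

# Scalar form of (10.1): mho = 2V / (1 + sqrt(1 + 4UV)) = -V * sigma(UV).
U, V = 0.64, 2.0
mho = -V * sigma(U * V)
print(mho + mho * U * mho - V)
```

The last printed value is the residual of the scalar Riccati equation (9.5) and should be negligible.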

THEOREM 11.1. Suppose the matrix $A$ in the linear system (8.1) is asymptotically stable and the matrix $B$ is of full row rank, that is, $\operatorname{rank} B = n \leqslant m$. Then for any $\gamma \geqslant 1$, the robustness index (7.13), which corresponds to the loss functional (11.5) with a weight matrix $\Pi \succ 0$, can be computed as

$$Z(\gamma) = J(0,\Sigma_\lambda). \qquad (11.7)$$

Here, $J$ is the function, defined by (11.3), and the matrix $\Sigma_\lambda \succ 0$ is a solution to the algebraic equation

$$\Sigma_\lambda = \Big(A^{\mathrm T}\Gamma_1^{-1/2}\big(I_n + V\sigma(M)\big)\Gamma_1^{-1/2}A + \Gamma_1^{-1/2}\big(I_n + \sigma(M)U\big)\Gamma_1^{-1/2} - \lambda\Pi\Big)^{-1}, \qquad (11.8)$$

which is defined in terms of (11.1), (11.4), (11.6) and depends on a scalar parameter $\lambda$ to be found from the equation

$$\operatorname{Tr}(\Pi\Sigma_\lambda)/\operatorname{Tr}(\Pi\Gamma) = \gamma. \qquad (11.9)$$

Proof. The loss functional $\Xi(P_*,\Phi)$ in (11.5) depends on the state distribution $\Phi$ only through its first two moments $\alpha$ and $\Sigma$ and so does the right-hand side of the inequality in (11.2) which is achieved for Gaussian state distributions $\Phi$. Hence, the minimization in (7.13) can be reduced to the class of Gaussian distributions $\Phi$ without affecting the minimum value. This allows the robustness index $Z(\gamma)$, which corresponds to (11.5), to be computed by solving a constrained optimization problem

$$Z(\gamma) = \min\big\{J(\alpha,\Sigma) :\ \alpha \in \mathbb R^n,\ \Sigma \succ 0,\ \Xi(\alpha,\Sigma) \geqslant \gamma\operatorname{Tr}(\Pi\Gamma)/2\big\}, \qquad (11.10)$$

where

$$\Xi(\alpha,\Sigma) := \big(\|\alpha\|_\Pi^2 + \operatorname{Tr}(\Pi\Sigma)\big)/2, \qquad (11.11)$$

and the 1/2 factor is introduced for the sake of convenience. In view of (11.3) and (11.11), the Lagrange function for the constrained minimization problem (11.10) takes the form
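$$\Upsilon(\alpha,\Sigma) := J(\alpha,\Sigma) - \lambda\,\Xi(\alpha,\Sigma)$$
$$= \Big(\|\alpha\|_{(I_n - A^{\mathrm T})\Gamma_1^{-1}(I_n - A) - \lambda\Pi}^2 + \ln\det(\Gamma_1/2) + \operatorname{Tr}\big((A^{\mathrm T}\Gamma_1^{-1}A + \Gamma_1^{-1} - \lambda\Pi)\Sigma\big) - \ln\det\Sigma + \ln\det\big(I_n + \sqrt{I_n + 4M}\big) - \operatorname{Tr}\sqrt{I_n + 4M}\Big)/2, \qquad (11.12)$$

where $\lambda \in \mathbb R$ is a Lagrange multiplier. The dependence of the Lagrange function $\Upsilon$ on $\alpha$ is quadratic and can be decoupled from the dependence on $\Sigma$. The corresponding quadratic form is positive definite if and only if

$$\lambda < 1/\rho\big(\Pi(I_n - A)^{-1}\Gamma_1(I_n - A^{\mathrm T})^{-1}\big).$$

In this case, $\min_{\alpha\in\mathbb R^n}\Upsilon(\alpha,\Sigma)$ is achieved at the unique point $\alpha = 0$, so that the minimization of the Lagrange function $\Upsilon$ in (11.12) reduces to

$$\min_{\alpha\in\mathbb R^n,\ \Sigma\succ 0}\Upsilon(\alpha,\Sigma) = \min_{\Sigma\succ 0}\Upsilon(0,\Sigma). \qquad (11.13)$$

We will now find a stationary point of the function $\Upsilon(0,\Sigma)$. In view of the identity $\ln\det N = \operatorname{Tr}\ln N$ for a matrix $N$ with positive real spectrum, the application of [43, Lemma 4] (see also [37, p. 270]) yields the following first variation

$$\delta\Big(\ln\det\big(I_n + \sqrt{I_n + 4M}\big) - \operatorname{Tr}\sqrt{I_n + 4M}\Big) = \operatorname{Tr}\big(\sigma(M)\,\delta M\big), \qquad (11.14)$$

where the function $\sigma$ is defined by (11.6). Since the first variation of the map $\Sigma \mapsto M$, described by (11.4), is

$$\delta M = \Gamma_1^{-1/2}A\big((\delta\Sigma)A^{\mathrm T}\Gamma_1^{-1}\Sigma + \Sigma A^{\mathrm T}\Gamma_1^{-1}\delta\Sigma\big)\Gamma_1^{-1/2},$$

then the Frechet derivative of the function in (11.14), as a composite function of the matrix $\Sigma$, can be computed as

$$\partial_\Sigma\Big(\ln\det\big(I_n + \sqrt{I_n + 4M}\big) - \operatorname{Tr}\sqrt{I_n + 4M}\Big)$$
$$= A^{\mathrm T}\Gamma_1^{-1}\Sigma\,\Gamma_1^{-1/2}\sigma(M)\Gamma_1^{-1/2}A + \Gamma_1^{-1/2}\sigma(M)\Gamma_1^{-1/2}A\Sigma A^{\mathrm T}\Gamma_1^{-1}$$
$$= A^{\mathrm T}\Gamma_1^{-1/2}V\sigma(M)\Gamma_1^{-1/2}A + \Gamma_1^{-1/2}\sigma(M)U\Gamma_1^{-1/2}. \qquad (11.15)$$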

The right-hand side of (11.15) is a real symmetric matrix, which inherits its symmetry from $\Sigma$ in view of the identities $\sigma(UV)U = U\sigma(VU)$ and $V\sigma(UV) = \sigma(VU)V$ and the symmetry of the matrices $U$ and $V$ in (11.4). From (11.15), it follows that the equation $\partial_\Sigma\Upsilon(0,\Sigma) = 0$ for a stationary point $\Sigma$ of the Lagrange function (11.12) in the minimization problem (11.13) takes the form

$$A^{\mathrm T}\Gamma_1^{-1}A + \Gamma_1^{-1} - \lambda\Pi - \Sigma^{-1} + A^{\mathrm T}\Gamma_1^{-1/2}V\sigma(M)\Gamma_1^{-1/2}A + \Gamma_1^{-1/2}\sigma(M)U\Gamma_1^{-1/2} = 0,$$

which is equivalent to (11.8). The solution $\Sigma_\lambda$ of this equation depends on the Lagrange multiplier $\lambda$, which, by the standard procedure, is to be found from (11.9) in accordance with the constraint in (11.10). □
Note that (11.8) and (11.9) form a complete set of equations for finding the pair $(\lambda,\Sigma_\lambda)$ for a given $\gamma \geqslant 1$. In particular, the solution of these equations for $\gamma = 1$ is $\lambda = 0$ and $\Sigma_0 = \Gamma$, which corresponds to the nominal noise model, with $Z(1) = 0$. Properties of the solution for $\gamma > 1$, including existence and uniqueness, require additional investigation and will be discussed elsewhere. A numerical scheme for solving (11.8)-(11.9) for $\gamma > 1$ can be based on the ideas of homotopy methods, whereby (11.8) is solved iteratively for gradually increasing values of the Lagrange multiplier $\lambda$ starting from $\lambda = 0$. A closed-form calculation of the robustness index for a one-dimensional example is given in the next section.






12. Illustrative example: one-dimensional linear systems. In order to avoid reachability issues for short time horizons $t$, which are associated with the condition $t \geqslant n$ in Theorems 8.1 and 9.2 (or its refined version $t \geqslant \tau$ based on (8.22)), consider the one-dimensional case $n = m = 1$. Here, both $A$ and $B$ in (8.1) are scalars, with $|A| < 1$ and $B \ne 0$, and the nominal marginal distribution $R$ of the noise in (8.2) is $\mathcal N(0,1)$. In this case, the variance of the nominal invariant state distribution $P_*$ in (8.3) is

$$\Gamma = \frac{B^2}{1 - A^2}. \qquad (12.1)$$

The equations (8.5) and (9.4) give

$$\Gamma_t = (1 - A^{2t})\Gamma, \qquad U = \frac{A^{2t}\Sigma}{\Gamma_t}, \qquad V = \frac{\Theta}{\Gamma_t}. \qquad (12.2)$$

The solution (10.1) of the Riccati equation (9.5) takes the form

$$\mho = \frac{2V}{1 + \sqrt{1 + 4UV}} = \frac{2\Theta}{\Gamma_t + \sqrt{\Gamma_t^2 + 4A^{2t}\Sigma\Theta}}. \qquad (12.3)$$
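By substituting these formulae into (9.3) or (10.9), it follows that the minimum required conditional relative entropy supply for the noise player to drive the system from an initial state distribution $\Phi := \mathcal N(\alpha,\Sigma)$ to a terminal state distribution $\Psi := \mathcal N(\beta,\Theta)$ (both with positive variances $\Sigma$ and $\Theta$) in a given time $t$ is

$$J_t(\Phi,\Psi) = \frac12\left(\frac{(\beta - A^t\alpha)^2 + A^{2t}\Sigma + \Theta - \sqrt{\Gamma_t^2 + 4A^{2t}\Sigma\Theta}}{\Gamma_t} - \ln\mho\right). \qquad (12.4)$$

The minimum conditional relative entropy supply rate $J(\alpha,\Sigma)$ in (11.2), required to maintain the system in the fixed Gaussian state distribution $\mathcal N(\alpha,\Sigma)$, is calculated by letting $t := 1$, $\beta := \alpha$, $\Theta := \Sigma$ in (12.2)-(12.4) which yields

$$J(\alpha,\Sigma) = \frac12\left(\frac{1 - A}{1 + A}\,\frac{\alpha^2}{\Gamma} + \frac{(1 + A^2)\gamma - \sqrt{(1 - A^2)^2 + 4A^2\gamma^2}}{1 - A^2} - \ln\mho\right), \qquad (12.5)$$

where

$$\mho = \frac{2\gamma}{1 - A^2 + \sqrt{(1 - A^2)^2 + 4A^2\gamma^2}}, \qquad \gamma := \frac{\Sigma}{\Gamma}. \qquad (12.6)$$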

The discrepancy between $\mathcal N(\alpha,\Sigma)$ and the nominal invariant state distribution $P_* = \mathcal N(0,\Gamma)$ with variance (12.1) enters (12.5) only through $\alpha^2/\Gamma$ and the variance ratio $\gamma$ in (12.6). In this one-dimensional case, the weight $\Pi$ in the loss functional (11.5) can be cancelled out and the functional takes the form

$$\Xi(P_*,\Phi) = \frac{\alpha^2 + \Sigma}{\Gamma} = \frac{\alpha^2}{\Gamma} + \gamma. \qquad (12.7)$$

In view of (11.7) in Theorem 11.1, the robustness index (7.13), which corresponds to (12.7), reduces to $J(0,\Sigma)$ and is computed by letting $\alpha := 0$ in (12.5):

$$Z(\gamma) = \frac12\left(\frac{(1 + A^2)\gamma - \sqrt{(1 - A^2)^2 + 4A^2\gamma^2}}{1 - A^2} - \ln\frac{2\gamma}{1 - A^2 + \sqrt{(1 - A^2)^2 + 4A^2\gamma^2}}\right), \qquad \gamma \geqslant 1; \qquad (12.8)$$

see Fig. 12.1. Note that $Z(\gamma)$ vanishes for $\gamma = 1$ and is strictly decreasing in $|A|$ for any variance ratio $\gamma > 1$. That is, the less stable the system is, the easier it is for the noise player (in the sense of the minimum required conditional relative entropy supply rate) to maintain the system in a state distribution $\Phi$ with a given larger variance compared to the nominal invariant state distribution $P_*$. This is in agreement with the intuitive expectation that the deviation of the system from the nominal behavior can be achieved by smaller deviations of the noise from its nominal model since their accumulation is more efficient if the system is less stable.
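The dependence of the scalar robustness index on $|A|$ can be tabulated directly. The sketch below assumes that $Z(\gamma) = J(0,\gamma\Gamma)$, obtained by letting $\alpha := 0$ in (12.5) with $\mho$ from (12.6); the function name `robustness_index` and the sample values of $A$ and $\gamma$ are illustrative.

```python
import math

def robustness_index(A, gamma):
    """Z(gamma) = J(0, gamma * Gamma) for the scalar system with |A| < 1."""
    a2 = A * A
    root = math.sqrt((1.0 - a2) ** 2 + 4.0 * a2 * gamma * gamma)
    mho = 2.0 * gamma / (1.0 - a2 + root)          # scalar Riccati solution (12.6)
    return 0.5 * (((1.0 + a2) * gamma - root) / (1.0 - a2) - math.log(mho))

gamma = 2.0
values = [robustness_index(A, gamma) for A in (0.0, 0.3, 0.6, 0.9)]
print(values)
print(robustness_index(0.6, 1.0))
```

For $A = 0$ the index reduces to $(\gamma - 1 - \ln\gamma)/2$, the relative entropy between two zero-mean Gaussian laws with variance ratio $\gamma$, and the tabulated values decrease as $|A|$ grows, in line with the discussion above.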